
Scriptorama: A Slightly Skeptical View
on Scripting Languages


This is the central page of the Softpanorama website, because I am strongly convinced that the development of scripting languages, not the replication of the efforts of the BSD group undertaken by Stallman and Torvalds, is the central part of open source. See Scripting languages as VHLL for more details.

 
Ordinarily technology changes fast. But programming languages are different: programming languages are not just technology, but what programmers think in.

They're half technology and half religion. And so the median language, meaning whatever language the median programmer uses, moves as slow as an iceberg.

Paul Graham: Beating the Averages

Libraries are more important than the language.

Donald Knuth


Introduction

A fruitful way to think about language development is to consider it to be a special type of theory building. Peter Naur suggested that programming in general is a theory-building activity in his 1985 paper "Programming as Theory Building". But the idea is especially applicable to compilers and interpreters. What Peter Naur failed to understand was that the design of programming languages has religious overtones and sometimes represents an activity which is pretty close to the process of creating a new, obscure cult ;-). Clueless academics publishing junk papers at obscure conferences are the high priests of the church of programming languages. Some, like Niklaus Wirth and Edsger W. Dijkstra, (temporarily) reached a status close to that of (false) prophets :-).

On a deep conceptual level, building a new language is a human way of solving complex problems. That means that compiler construction is probably the most underappreciated paradigm for programming large systems. Much more so than the greatly oversold object-oriented programming, whose benefits are greatly overstated.

For users, programming languages distinctly have religious aspects, so decisions about what language to use are often far from rational and are mainly cultural. Indoctrination at the university plays a very important role: universities were recently instrumental in making Java a new Cobol.

The second important observation about programming languages is that the language per se is just a tiny part of what can be called the language programming environment. The latter includes libraries, IDEs, books, the level of adoption at universities, popular and important applications written in the language, the level of support, the key players that back the language on major platforms such as Windows and Linux, and other similar things.

A mediocre language with a good programming environment can give a run for the money to languages of superior design that come naked. This is the story behind the success of Java and PHP. A critical application is also very important, and this is the story of the success of PHP, which is nothing but a bastardized derivative of Perl (with all the most interesting Perl features surgically removed ;-) adapted to the creation of dynamic websites using the so-called LAMP stack.

Progress in programming languages has been very uneven and contains several setbacks. Currently this progress is mainly limited to the development of so-called scripting languages. The traditional high-level language field has been stagnant for many decades. From 2000 to 2017 we observed the huge success of JavaScript; Python encroached on Perl territory (including genomics/bioinformatics), and R in turn started squeezing Python in several areas. At the same time Ruby, despite initial success, remained a niche language. PHP still holds its own in website design.

Some observations about scripting language design and  usage

At the same time there are some mysterious, unanswered questions about the factors that help a particular scripting language to increase its user base, or cause it to fail in popularity. Among them:

Nothing succeeds like success

Those are difficult questions to answer without some way of classifying languages into different categories. Several such classifications exist. First of all, as with natural languages, the number of people who speak a given language is a tremendous force that can overcome any real or perceived deficiencies of the language. In programming languages, as in natural languages, nothing succeeds like success.

The second interesting category is the number of applications written in a particular language that became part of Linux or, at least, are included in the standard RHEL/Fedora/CentOS or Debian/Ubuntu repositories.

The third relevant category is the number and quality of books for the particular language.

Complexity Curse

The history of programming languages raises interesting general questions about the limits of complexity of programming languages. There is strong historical evidence that a language with a simpler, or even simplistic, core (Basic, Pascal) has a better chance of acquiring a high level of popularity.

The underlying fact here is probably that most programmers are at best mediocre, and such programmers tend, on an intuitive level, to avoid more complex, richer languages and to prefer, say, Pascal to PL/1 and PHP to Perl. Or at least they avoid them at a particular phase of language development (C++ is not a simpler language than PL/1, but it was widely adopted because of the progress of hardware, the availability of compilers and, not least, because it was associated with OO exactly at the time OO became a mainstream fashion).

Complex non-orthogonal languages can succeed only as the result of a long period of language development (which usually adds complexity -- just compare Fortran IV with Fortran 90, or PHP 3 with PHP 5) from a smaller core. Attempts to ride some fashionable new trend by extending an existing popular language to the new "paradigm" have also proved relatively successful (OO programming in the case of C++, which is a superset of C).

Historically, few complex languages were successful (PL/1, Ada, Perl, C++), and even when they were successful, their success typically proved temporary rather than permanent (PL/1, Ada, Perl). As Professor Wilkes noted (iee90):

Things move slowly in the computer language field but, over a sufficiently long period of time, it is possible to discern trends. In the 1970s, there was a vogue among system programmers for BCPL, a typeless language. This has now run its course, and system programmers appreciate some typing support. At the same time, they like a language with low level features that enable them to do things their way, rather than the compiler’s way, when they want to.

They continue to have a strong preference for a lean language. At present they tend to favor C in its various versions. For applications in which flexibility is important, Lisp may be said to have gained strength as a popular programming language.

Further progress is necessary in the direction of achieving modularity. No language has so far emerged which exploits objects in a fully satisfactory manner, although C++ goes a long way. ADA was progressive in this respect, but unfortunately it is in the process of collapsing under its own great weight.

ADA is an example of what can happen when an official attempt is made to orchestrate technical advances. After the experience with PL/1 and ALGOL 68, it should have been clear that the future did not lie with massively large languages.

I would direct the reader’s attention to Modula-3, a modest attempt to build on the appeal and success of Pascal and Modula-2 [12].

The complexity of the compiler/interpreter also matters, as it affects portability: this is one thing that probably doomed PL/1 (and later Ada), although these days a new language typically comes with an open source compiler (or, in the case of scripting languages, an interpreter), so this is less of a problem.

Here is an interesting take on language design from the preface to the book The D Programming Language (the D language failed to achieve any significant level of popularity):

Programming language design seeks power in simplicity and, when successful, begets beauty.

Choosing the trade-offs among contradictory requirements is a difficult task that requires good taste from the language designer as much as mastery of theoretical principles and of practical implementation matters. Programming language design is software-engineering-complete.

D is a language that attempts to consistently do the right thing within the constraints it chose: system-level access to computing resources, high performance, and syntactic similarity with C-derived languages. In trying to do the right thing, D sometimes stays with tradition and does what other languages do, and other times it breaks tradition with a fresh, innovative solution. On occasion that meant revisiting the very constraints that D ostensibly embraced. For example, large program fragments or indeed entire programs can be written in a well-defined memory-safe subset of D, which entails giving away a small amount of system-level access for a large gain in program debuggability.

You may be interested in D if the following values are important to you:

The role of fashion

At the initial, most difficult stage of language development, the language should solve an important problem that is inadequately solved by currently popular languages. But at the same time the language has few chances to succeed unless it fits perfectly into the current software fashion. This "fashion factor" is probably as important as several other factors combined, with the notable exception of the "language sponsor" factor. The latter can make or break a language.

As in women's dress, fashion rules in language design, and with time this trend has become more and more pronounced. A new language should represent the current fashionable trend. For example, OO programming was a visiting card into the world of "big, successful languages" since probably the early '90s (C++, Java, Python). Before that, "structured programming" and "verification" (Pascal, Modula) played a similar role.

Programming environment and the role of "powerful sponsor" in language success

PL/1, Java, C#, Ada and Python are languages that had powerful sponsors. Pascal, Basic, Forth, and partially Perl (O'Reilly was a sponsor for a short period of time) are examples of languages that had no such sponsor during the initial period of development. C and C++ are somewhere in between.

But the language itself is not enough. Any language now needs a "programming environment", which consists of a set of libraries, a debugger and other tools (a make tool, lint, a pretty-printer, etc.). The set of standard libraries and the debugger are probably the two most important elements. They cost a lot of time (or money) to develop, and here the role of a powerful sponsor is hard to overestimate.

While this is not a necessary condition for becoming popular, it really helps: other things being equal, the weight of the sponsor of the language does matter. For example Java, being a weak, inconsistent language (C-- with garbage collection and OO), was rammed down programmers' throats on the strength of marketing and the huge amount of money spent on creating the Java programming environment.

The same was partially true for C# and Python. That's why Python, despite its "non-Unix" origin, is a more viable scripting language now than, say, Perl (which is better integrated with Unix and has pretty innovative, for scripting languages, support of pointers and regular expressions), or Ruby (which had support for coroutines from day one, not as a "bolted-on" feature like in Python).

As in political campaigns, negative advertising also matters. For example Perl suffered greatly from the smear comparing programs written in it with "white noise", and then from the withdrawal of O'Reilly from the role of sponsor of the language (although it continues to milk the Perl book publishing franchise ;-)

People have proved to be pretty gullible, and in this sense language marketing is not that different from the marketing of women's clothing :-)

Language level and success

One very important classification of programming languages is based on the so-called level of the language. Essentially, once there is at least one successful language at a given level, the success of other languages at the same level becomes more problematic. The best chances belong to languages that operate at a slightly, but still noticeably, higher level than their successful predecessors.

The level of a language can informally be described as the number of statements (or, more correctly, the number of lexical units (tokens)) needed to write a solution to a particular problem in one language versus another. On this basis we can distinguish several levels of programming languages:
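Whatever the exact boundaries between levels, one crude way to make the notion concrete is to count lexical units mechanically. The sketch below (a hypothetical illustration, not a rigorous metric) uses Python's tokenize module (Python 3) to count the tokens of a Python solution; comparing such counts for solutions of the same problem in two languages gives a rough feel for their relative level.

import io
import tokenize

def count_tokens(source):
    """Count the lexical units (tokens) in a piece of Python source code."""
    toks = tokenize.generate_tokens(io.StringIO(source).readline)
    # Skip purely structural tokens that carry no program text.
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
            tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT}
    return sum(1 for tok in toks if tok.type not in skip)

# The same task -- summing the numbers read from a file -- expressed in Python:
solution = '''
total = 0
with open("numbers.txt") as f:
    for line in f:
        total += int(line)
print(total)
'''
print(count_tokens(solution))  # a few dozen tokens; a C version needs far more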

 "Nanny languages" vs "Sharp razor" languages

Some people distinguish between "nanny languages" and "sharp razor" languages. The latter do not attempt to protect the user from his errors, while the former usually go too far... The right compromise is extremely difficult to find.

For example, I consider the explicit availability of pointers an important feature of a language that greatly increases its expressive power and far outweighs the risk of errors in the hands of unskilled practitioners. In other words, attempts to make a language "safer" often misfire.

Expressive style of the languages

Another useful typology is based on the expressive style of the language:

Those categories are not pure and somewhat overlap. For example, it's possible to program in an object-oriented style in C, or even in assembler. Some scripting languages like Perl have built-in regular expression engines that are part of the language, so they have a functional component despite being procedural. Some relatively low-level (Algol-style) languages implement garbage collection; a good example is Java. There are also scripting languages that compile into a common language framework designed for high-level languages; for example, IronPython compiles into .NET.
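As a small illustration that style is partly orthogonal to the language, here is the same task written twice in Python, once in a procedural style and once in a functional style; this is only a sketch for contrast, not a recommendation of either.

# Procedural style: explicit loop and a mutable accumulator.
def sum_even_squares_procedural(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total

# Functional style: the same computation as a single expression
# built from filtering and mapping inside a generator expression.
def sum_even_squares_functional(numbers):
    return sum(n * n for n in numbers if n % 2 == 0)

data = [1, 2, 3, 4, 5, 6]
assert sum_even_squares_procedural(data) == sum_even_squares_functional(data) == 56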

Weak correlation between quality of design and popularity

The popularity of programming languages is not strongly connected to their quality. Some languages that look like a collection of language designer blunders (PHP, Java) became quite popular. Java became a new Cobol, and PHP dominates the construction of dynamic websites. The dominant technology for such websites is often called LAMP, which means Linux-Apache-MySQL-PHP. Being a highly simplified but badly constructed subset of Perl, a kind of new Basic for dynamic website construction, PHP provides a rather depressing experience. I was unpleasantly surprised when I learned that the Wikipedia engine was rewritten from Perl to PHP some time ago, but this fact nicely illustrates the trend.

So language design quality has little to do with a language's success in the marketplace. Simpler languages have wider appeal, as the success of PHP (which at the beginning came at the expense of Perl) suggests. In addition, much depends on whether the language has a powerful sponsor, as was the case with Java (Sun and IBM) as well as Python (Google).

Progress in programming languages has been very uneven and contains several setbacks, like Java. Currently this progress is usually associated with scripting languages. The history of programming languages raises interesting general questions about the "laws" of programming language design. First, let's reproduce several notable quotes:

  1. Knuth law of optimization: "Premature optimization is the root of all evil (or at least most of it) in programming." - Donald Knuth
  2. "Greenspun's Tenth Rule of Programming: any sufficiently complicated C or Fortran program contains an ad hoc informally-specified bug-ridden slow implementation of half of Common Lisp." - Phil Greenspun
  3. "The key to performance is elegance, not battalions of special cases." - Jon Bentley and Doug McIlroy
  4. "Some may say Ruby is a bad rip-off of Lisp or Smalltalk, and I admit that. But it is nicer to ordinary people." - Matz, LL2
  5. Most papers in computer science describe how their author learned what someone else already knew. - Peter Landin
  6. "The only way to learn a new programming language is by writing programs in it." - Kernighan and Ritchie
  7. "If I had a nickel for every time I've written "for (i = 0; i < N; i++)" in C, I'd be a millionaire." - Mike Vanier
  8. "Language designers are not intellectuals. They're not as interested in thinking as you might hope. They just want to get a language done and start using it." - Dave Moon
  9. "Don't worry about what anybody else is going to do. The best way to predict the future is to invent it." - Alan Kay
  10. "Programs must be written for people to read, and only incidentally for machines to execute." - Abelson & Sussman, SICP, preface to the first edition

Please note that it is one thing to read a language manual and appreciate how good the concepts are, and quite another to bet your project on a new, unproven language without good debuggers, manuals and, most importantly, libraries. The debugger is very important, but standard libraries are crucial: they are the factor that makes or breaks new languages.

In this sense languages are much like cars. For many people a car is just the thing they use to get to work and the shopping mall; they are not very interested in whether the engine is inline or V-type, or whether the transmission uses fuzzy logic. What they care about is safety, reliability, mileage, insurance and the size of the trunk. In this sense "worse is better" is very true. I have already mentioned the importance of the debugger. The other important criterion is the quality and availability of libraries. Actually, libraries account for perhaps 80% of the usability of a language; in a sense, libraries are more important than the language...

The popular belief that scripting is an "unsafe", "second rate" or "prototype-only" solution is completely wrong. If a project dies, it does not matter what the implementation language was; so for any successful project on a tough schedule a scripting language (especially in a dual scripting-language-plus-C combination, for example Tcl+C) is an optimal blend for a large class of tasks. Such an approach helps to separate architectural decisions from implementation details much better than any OO model does.
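A minimal sketch of the dual-language idea, using Python as the scripting half: the compiled C code does the low-level work, while the script supplies the glue and the architectural decisions. The example calls the standard C library through ctypes; the way the library is located is platform-dependent (assumed here to be a POSIX system with a standard libc), so treat it as an assumption rather than a recipe.

import ctypes
import ctypes.util

# Locate and load the platform's C library (assumption: a standard libc is present).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C function's signature so the call is well-typed on the script side.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# The decision about what to compute stays in the script;
# the implementation detail (how strlen works) stays in compiled C code.
print(libc.strlen(b"hello, world"))   # prints 12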

Moreover, even for tasks that handle a fair amount of computation and data (computationally intensive tasks), languages such as Python and Perl are often (but not always!) competitive with C++, C# and, especially, Java.



Programming Language Development Timeline

Here is a timeline of programming languages, modified from Byte (for the original see BYTE.com, September 1995 / 20th Anniversary).

Forties

ca. 1946


1949

Fifties


1951


1952

1957


1958


1959

Sixties


1960


1962


1963


1964


1965


1966


1967



1969

Seventies


1970


1972


1974


1975


1976


1977


1978


1979

Eighties


1980


1981


1982


1983


1984


1985


1986


1987


1988


1989

Nineties


1990


1991


1992


1993


1994


1995


1996


1997


2000


2006

2007 

2008:

2009:

2011

2017:

Special note on Scripting languages

Scripting helps to avoid the OO trap that is pushed by
  "a horde of practically illiterate researchers
publishing crap papers in junk conferences."

Despite the fact that scripting languages are a really important computer science phenomenon, they are usually happily ignored in university curricula. Students are instead indoctrinated (or, in less politically correct terms, "brainwashed") in Java and OO programming ;-)

This site tries to give scripting languages proper emphasis and promotes them as an alternative to the mainstream reliance on the "Java as a new Cobol" approach to software development. Please read my introduction to the topic, which was recently converted into the article A Slightly Skeptical View on Scripting Languages.

The tragedy of the scripting language designer is that there is no way to overestimate the level of abuse of any feature of the language. Half of all programmers are, by definition, below average, and it is this half that matters most in an enterprise environment. In a way, the higher the level of the programmer, the less relevant the limitations of the language are. That's why statements like "Perl is poorly suited for large project development" are plain silly. With proper discipline it is perfectly suitable, and programmers can be more productive with Perl than with Java. The real question is "What is the quality and size of the team?"

Scripting is part of the Unix cultural tradition, and Unix was the initial development platform for most of the mainstream scripting languages, with the exception of REXX. But they are portable, and all of them can now be used on Windows and other OSes.


Different scripting languages provide different levels of integration with the base OS API (for example, Unix or Windows). For example, IronPython compiles into .NET and provides a pretty high level of integration with Windows. The same is true of Perl and Unix: almost all Unix system calls are available directly from Perl. Moreover, Perl integrates most of the Unix API in a very natural way, making it a perfect replacement for the shell when coding complex scripts. It also has a very good debugger, which is a weak point of shells like bash and ksh93.
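The same kind of direct access exists in Python's os module, shown here because the code examples on this page are in Python; a short, POSIX-only sketch:

import os
import stat

# Thin wrappers over Unix system calls: stat(2), getpid(2), uname(2), fork(2), waitpid(2).
info = os.stat("/etc/passwd")
print(oct(stat.S_IMODE(info.st_mode)), info.st_size)   # permission bits and file size
print(os.getpid(), os.uname().sysname)                 # process id and kernel name

pid = os.fork()                # fork(2); available on POSIX systems only
if pid == 0:
    os._exit(0)                # the child exits immediately
else:
    os.waitpid(pid, 0)         # the parent reaps the child, like wait(2)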

Unix proved that treating everything like a file is a powerful OS paradigm. In a similar way, scripting languages proved that "everything is a string" is an extremely powerful programming paradigm.


There are also several separate pages devoted to scripting in different applications. The main emphasis is on shells and Perl. Right now I am trying to convert my old Perl lecture notes into an eBook: Nikolai Bezroukov, Introduction to Perl for Unix System Administrators.

Along with pages devoted to major scripting languages, this site has many pages devoted to scripting in different applications. There are more than a dozen "Perl/Scripting tools for a particular area" pages. The most well developed and up-to-date pages of this set are probably Shells and Perl. This page's main purpose is to follow the changes in programming practices that can be called the "rise of scripting," as predicted in the famous John Ousterhout article Scripting: Higher Level Programming for the 21st Century in IEEE Computer (1998). In this brilliant paper he wrote:

...Scripting languages such as Perl and Tcl represent a very different style of programming than system programming languages such as C or Java. Scripting languages are designed for "gluing" applications; they use typeless approaches to achieve a higher level of programming and more rapid application development than system programming languages. Increases in computer speed and changes in the application mix are making scripting languages more and more important for applications of the future.

...Scripting languages and system programming languages are complementary, and most major computing platforms since the 1960's have provided both kinds of languages. The languages are typically used together in component frameworks, where components are created with system programming languages and glued together with scripting languages. However, several recent trends, such as faster machines, better scripting languages, the increasing importance of graphical user interfaces and component architectures, and the growth of the Internet, have greatly increased the applicability of scripting languages. These trends will continue over the next decade, with more and more new applications written entirely in scripting languages and system programming languages used primarily for creating components.

My e-book Portraits of Open Source Pioneers contains several chapters on scripting (most are in early draft stage) that expand on this topic. 

The reader must understand that the treatment of scripting languages in the press, and especially the academic press, is far from fair: entrenched academic interests often promote old or commercially supported paradigms until they retire, so a change of paradigm is often possible only with a change of generations. And people tend to live longer these days... Please also be aware that even respectable academic magazines like Communications of the ACM and IEEE Software often promote "cargo cult software engineering" like the Capability Maturity Model (CMM).

Dr. Nikolai Bezroukov



[Oct 13, 2019] shell - Calling an external command in Python

Oct 13, 2019 | stackoverflow.com

Look at the subprocess module in the standard library:

import subprocess
subprocess.run(["ls", "-l"])

The advantage of subprocess vs. system is that it is more flexible (you can get the stdout , stderr , the "real" status code, better error handling, etc...).

The official documentation recommends the subprocess module over the alternative os.system() :

The subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using this function [ os.system() ].

The Replacing Older Functions with the subprocess Module section in the subprocess documentation may have some helpful recipes.
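A short sketch of the flexibility mentioned above: capturing stdout, stderr and the return code with run (Python 3.5+; the failing ls invocation is just an illustrative command).

import subprocess

result = subprocess.run(
    ["ls", "-l", "/nonexistent"],      # an example command that will fail
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,           # decode bytes to str (text=True on 3.7+)
)
print(result.returncode)               # non-zero exit status from ls
print(result.stdout)                   # captured standard output (empty here)
print(result.stderr)                   # captured error message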

For versions of Python before 3.5, use call :

import subprocess
subprocess.call(["ls", "-l"])

Jean ,Dec 3, 2018 at 6:00

Here's a summary of the ways to call external programs and the advantages and disadvantages of each:
  1. os.system("some_command with args") passes the command and arguments to your system's shell. This is nice because you can actually run multiple commands at once in this manner and set up pipes and input/output redirection. For example:
    os.system("some_command < input_file | another_command > output_file")

However, while this is convenient, you have to manually handle the escaping of shell characters such as spaces, etc. On the other hand, this also lets you run commands which are simply shell commands and not actually external programs. See the documentation .

  2. stream = os.popen("some_command with args") will do the same thing as os.system except that it gives you a file-like object that you can use to access standard input/output for that process. There are 3 other variants of popen that all handle the i/o slightly differently. If you pass everything as a string, then your command is passed to the shell; if you pass them as a list then you don't need to worry about escaping anything. See the documentation .
  3. The Popen class of the subprocess module. This is intended as a replacement for os.popen but has the downside of being slightly more complicated by virtue of being so comprehensive. For example, you'd say:
    print subprocess.Popen("echo Hello World", shell=True, stdout=subprocess.PIPE).stdout.read()
    

    instead of:

    print os.popen("echo Hello World").read()
    

    but it is nice to have all of the options there in one unified class instead of 4 different popen functions. See the documentation .

  4. The call function from the subprocess module. This is basically just like the Popen class and takes all of the same arguments, but it simply waits until the command completes and gives you the return code. For example:
    return_code = subprocess.call("echo Hello World", shell=True)

    See the documentation .

  5. If you're on Python 3.5 or later, you can use the new subprocess.run function, which is a lot like the above but even more flexible and returns a CompletedProcess object when the command finishes executing.
  6. The os module also has all of the fork/exec/spawn functions that you'd have in a C program, but I don't recommend using them directly.

The subprocess module should probably be what you use.

Finally, please be aware that for all methods where you pass the final command to be executed by the shell as a string, you are responsible for escaping it. There are serious security implications if any part of the string that you pass cannot be fully trusted, for example if a user is entering some or any part of the string. If you are unsure, only use these methods with constants. To give you a hint of the implications, consider this code:

print subprocess.Popen("echo %s " % user_input, stdout=PIPE).stdout.read()

and imagine that the user enters something like "my mama didnt love me && rm -rf /", which could erase the whole filesystem.
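If a shell string really must be built from user input, one hedged mitigation (Python 3.3+ provides shlex.quote; on Python 2 the equivalent is pipes.quote) is to quote each piece, or better, to avoid the shell entirely by passing an argument list:

import shlex
import subprocess

user_input = "my mama didnt love me && rm -rf /"

# Quoting turns the whole value into a single, harmless argument to echo.
subprocess.call("echo %s" % shlex.quote(user_input), shell=True)

# Safer still: no shell at all, so metacharacters are never interpreted.
subprocess.call(["echo", user_input])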

jfs ,Dec 3, 2018 at 5:39

Typical implementation:
import subprocess

p = subprocess.Popen('ls', shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in p.stdout.readlines():
    print line,
retval = p.wait()

You are free to do what you want with the stdout data in the pipe. In fact, you can simply omit those parameters ( stdout= and stderr= ) and it'll behave like os.system() .

maranas ,Feb 24 at 19:05

Some hints on detaching the child process from the calling one (starting the child process in background).

Suppose you want to start a long task from a CGI-script, that is the child process should live longer than the CGI-script execution process.

The classical example from the subprocess module docs is:

import subprocess
import sys

# some code here

pid = subprocess.Popen([sys.executable, "longtask.py"]) # call subprocess

# some more code here

The idea here is that you do not want to wait in the line 'call subprocess' until the longtask.py is finished. But it is not clear what happens after the line 'some more code here' from the example.

My target platform was freebsd, but the development was on windows, so I faced the problem on windows first.

On Windows (Win XP), the parent process will not finish until the longtask.py has finished its work. It is not what you want in a CGI script. The problem is not specific to Python; in the PHP community the problems are the same.

The solution is to pass DETACHED_PROCESS Process Creation Flag to the underlying CreateProcess function in win API. If you happen to have installed pywin32 you can import the flag from the win32process module, otherwise you should define it yourself:

DETACHED_PROCESS = 0x00000008

pid = subprocess.Popen([sys.executable, "longtask.py"],
                       creationflags=DETACHED_PROCESS).pid

/* UPD 2015.10.27 @eryksun in a comment below notes, that the semantically correct flag is CREATE_NEW_CONSOLE (0x00000010) */

On freebsd we have another problem: when the parent process is finished, it finishes the child processes as well. And that is not what you want in CGI-script either. Some experiments showed that the problem seemed to be in sharing sys.stdout. And the working solution was the following:

pid = subprocess.Popen([sys.executable, "longtask.py"], stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)

I have not checked the code on other platforms and do not know the reasons of the behaviour on freebsd. If anyone knows, please share your ideas. Googling on starting background processes in Python does not shed any light yet.

Daniel F ,Dec 4, 2018 at 8:36

I'd recommend using the subprocess module instead of os.system because it does shell escaping for you and is therefore much safer: http://docs.python.org/library/subprocess.html
subprocess.call(['ping', 'localhost'])

Fox Wilson ,Nov 7, 2017 at 23:19

import os
cmd = 'ls -al'
os.system(cmd)

If you want to return the results of the command, you can use os.popen . However, this is deprecated since version 2.6 in favor of the subprocess module , which other answers have covered well.

Peter Mortensen ,Dec 3, 2018 at 18:41

import os
os.system("your command")

Note that this is dangerous, since the command isn't cleaned. I leave it up to you to google for the relevant documentation on the 'os' and 'sys' modules. There are a bunch of functions (exec* and spawn*) that will do similar things.

Tom Fuller ,Oct 29, 2016 at 14:02

There are lots of different libraries which allow you to call external commands with Python. For each library I've given a description and shown an example of calling an external command. The command I used as the example is ls -l (list all files). If you want to find out more about any of the libraries, I've listed and linked the documentation for each of them.

Sources:

These are all the libraries:

Hopefully this will help you make a decision on which library to use :)

subprocess

Subprocess allows you to call external commands and connect them to their input/output/error pipes (stdin, stdout, and stderr). Subprocess is the default choice for running commands, but sometimes other modules are better.

subprocess.run(["ls", "-l"]) # Run command
subprocess.run(["ls", "-l"], stdout=subprocess.PIPE) # This will run the command and return any output
subprocess.run(shlex.split("ls -l")) # You can also use the shlex library to split the command

os

os is used for "operating system dependent functionality". It can also be used to call external commands with os.system and os.popen (Note: There is also a subprocess.popen). os will always run the shell and is a simple alternative for people who don't need to, or don't know how to use subprocess.run .

os.system("ls -l") # run command
os.popen("ls -l").read() # This will run the command and return any output

sh

sh is a subprocess interface which lets you call programs as if they were functions. This is useful if you want to run a command multiple times.

sh.ls("-l") # Run command normally
ls_cmd = sh.Command("ls") # Save command as a variable
ls_cmd() # Run command as if it were a function

plumbum

plumbum is a library for "script-like" Python programs. You can call programs like functions as in sh . Plumbum is useful if you want to run a pipeline without the shell.

ls_cmd = plumbum.local["ls"]["-l"] # get command
ls_cmd() # run command

pexpect

pexpect lets you spawn child applications, control them and find patterns in their output. This is a better alternative to subprocess for commands that expect a tty on Unix.

pexpect.run("ls -l") # Run command as normal
child = pexpect.spawn('scp foo user@example.com:.') # Spawns child application
child.expect('Password:') # When this is the output
child.sendline('mypassword')

fabric

fabric is a Python 2.5 and 2.7 library. It allows you to execute local and remote shell commands. Fabric is a simple alternative for running commands in a secure shell (SSH).

fabric.operations.local('ls -l') # Run command as normal
fabric.operations.local('ls -l', capture = True) # Run command and receive output

envoy

envoy is known as "subprocess for humans". It is used as a convenience wrapper around the subprocess module.

r = envoy.run("ls -l") # Run command
r.std_out # get output

commands

commands contains wrapper functions for os.popen , but it has been removed from Python 3 since subprocess is a better alternative.

The edit was based on J.F. Sebastian's comment.

Jorge E. Cardona ,Mar 13, 2012 at 0:12

I always use fabric for this things like:
from fabric.operations import local
result = local('ls', capture=True)
print "Content:/n%s" % (result, )

But this seem to be a good tool: sh (Python subprocess interface) .

Look an example:

from sh import vgdisplay
print vgdisplay()
print vgdisplay('-v')
print vgdisplay(v=True)

athanassis ,Oct 7, 2010 at 7:09

Check the "pexpect" Python library, too.

It allows for interactive controlling of external programs/commands, even ssh, ftp, telnet, etc. You can just type something like:

child = pexpect.spawn('ftp 192.168.0.24')

child.expect('(?i)name .*: ')

child.sendline('anonymous')

child.expect('(?i)password')

Bruno Bronosky ,Jan 30, 2018 at 18:18

If you need the output from the command you are calling, then you can use subprocess.check_output (Python 2.7+).
>>> subprocess.check_output(["ls", "-l", "/dev/null"])
'crw-rw-rw- 1 root root 1, 3 Oct 18  2007 /dev/null\n'

Also note the shell parameter.

If shell is True , the specified command will be executed through the shell. This can be useful if you are using Python primarily for the enhanced control flow it offers over most system shells and still want convenient access to other shell features such as shell pipes, filename wildcards, environment variable expansion, and expansion of ~ to a user's home directory. However, note that Python itself offers implementations of many shell-like features (in particular, glob , fnmatch , os.walk() , os.path.expandvars() , os.path.expanduser() , and shutil ).
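A small sketch of that last point: the shell conveniences listed in the quote (tilde expansion, environment variables, filename wildcards) have direct standard-library equivalents, so shell=True is often unnecessary.

import glob
import os

home = os.path.expanduser("~")                      # ~ expansion without a shell
logs = os.path.expandvars("$HOME/logs")             # environment variable expansion
py_files = glob.glob(os.path.join(home, "*.py"))    # filename wildcards

print(home, logs, py_files)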

community wiki (5 revs), Sep 19, 2018 at 13:55

With Standard Library

Use the subprocess module (Python 3):

import subprocess
subprocess.run(['ls', '-l'])

It is the recommended standard way. However, more complicated tasks (pipes, output, input, etc.) can be tedious to construct and write.
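As a hedged sketch of what that tedium looks like, here is the standard-library recipe for connecting two processes with a pipe, roughly equivalent to the shell pipeline ls -l | grep py:

import subprocess

ls = subprocess.Popen(["ls", "-l"], stdout=subprocess.PIPE)
grep = subprocess.Popen(["grep", "py"], stdin=ls.stdout, stdout=subprocess.PIPE)
ls.stdout.close()                  # allow ls to receive SIGPIPE if grep exits first
output, _ = grep.communicate()     # read the end of the pipeline
print(output.decode())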

Note on Python version: If you are still using Python 2, subprocess.call works in a similar way.

ProTip: shlex.split can help you parse the command for run , call , and other subprocess functions in case you don't want to (or can't!) provide them in the form of lists:

import shlex
import subprocess
subprocess.run(shlex.split('ls -l'))
With External Dependencies

If you do not mind external dependencies, use plumbum :

from plumbum.cmd import ifconfig
print(ifconfig['wlan0']())

It is the best subprocess wrapper. It's cross-platform, i.e. it works on both Windows and Unix-like systems. Install by pip install plumbum .

Another popular library is sh :

from sh import ifconfig
print(ifconfig('wlan0'))

However, sh dropped Windows support, so it's not as awesome as it used to be. Install by pip install sh .

Eric ,Apr 2, 2014 at 13:07

This is how I run my commands. This code has everything you need pretty much
from subprocess import Popen, PIPE
cmd = "ls -l ~/"
p = Popen(cmd , shell=True, stdout=PIPE, stderr=PIPE)
out, err = p.communicate()
print "Return code: ", p.returncode
print out.rstrip(), err.rstrip()

Joe ,Nov 15, 2012 at 17:13

Update:

subprocess.run is the recommended approach as of Python 3.5 if your code does not need to maintain compatibility with earlier Python versions. It's more consistent and offers similar ease-of-use as Envoy. (Piping isn't as straightforward though. See this question for how .)

Here's some examples from the docs .

Run a process:

>>> subprocess.run(["ls", "-l"])  # doesn't capture output
CompletedProcess(args=['ls', '-l'], returncode=0)

Raise on failed run:

>>> subprocess.run("exit 1", shell=True, check=True)
Traceback (most recent call last):
  ...
subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1

Capture output:

>>> subprocess.run(["ls", "-l", "/dev/null"], stdout=subprocess.PIPE)
CompletedProcess(args=['ls', '-l', '/dev/null'], returncode=0,
stdout=b'crw-rw-rw- 1 root root 1, 3 Jan 23 16:23 /dev/null\n')
Original answer:

I recommend trying Envoy . It's a wrapper for subprocess, which in turn aims to replace the older modules and functions. Envoy is subprocess for humans.

Example usage from the readme :

>>> r = envoy.run('git config', data='data to pipe in', timeout=2)

>>> r.status_code
129
>>> r.std_out
'usage: git config [options]'
>>> r.std_err
''

Pipe stuff around too:

>>> r = envoy.run('uptime | pbcopy')

>>> r.command
'pbcopy'
>>> r.status_code
0

>>> r.history
[<Response 'uptime'>]

tripleee ,Dec 3, 2018 at 5:33

Without the output of the result:
import os
os.system("your command here")

With output of the result:

import commands
commands.getoutput("your command here")
or
commands.getstatusoutput("your command here")

Ben Hoffstein ,Sep 18, 2008 at 1:43

https://docs.python.org/2/library/subprocess.html

...or for a very simple command:

import os
os.system('cat testfile')

stuckintheshuck ,Oct 10, 2014 at 17:41

There is also Plumbum
>>> from plumbum import local
>>> ls = local["ls"]
>>> ls
LocalCommand(<LocalPath /bin/ls>)
>>> ls()
u'build.py\ndist\ndocs\nLICENSE\nplumbum\nREADME.rst\nsetup.py\ntests\ntodo.txt\n'
>>> notepad = local["c:\\windows\\notepad.exe"]
>>> notepad()                                   # Notepad window pops up
u''                                             # Notepad window is closed by user, command returns

tripleee ,Dec 3, 2018 at 5:36

os.system is OK, but kind of dated. It's also not very secure. Instead, try subprocess . subprocess does not call sh directly and is therefore more secure than os.system .

Get more information here .

James Hirschorn ,Dec 3, 2018 at 5:16

Calling an external command in Python

Simple, use subprocess.run , which returns a CompletedProcess object:

>>> import subprocess
>>> completed_process = subprocess.run('python --version')
Python 3.6.1 :: Anaconda 4.4.0 (64-bit)
>>> completed_process
CompletedProcess(args='python --version', returncode=0)
Why?

As of Python 3.5, the documentation recommends subprocess.run :

The recommended approach to invoking subprocesses is to use the run() function for all use cases it can handle. For more advanced use cases, the underlying Popen interface can be used directly.

Here's an example of the simplest possible usage - and it does exactly as asked:

>>> import subprocess
>>> completed_process = subprocess.run('python --version')
Python 3.6.1 :: Anaconda 4.4.0 (64-bit)
>>> completed_process
CompletedProcess(args='python --version', returncode=0)

run waits for the command to successfully finish, then returns a CompletedProcess object. It may instead raise TimeoutExpired (if you give it a timeout= argument) or CalledProcessError (if it fails and you pass check=True ).

As you might infer from the above example, stdout and stderr both get piped to your own stdout and stderr by default.

We can inspect the returned object and see the command that was given and the returncode:

>>> completed_process.args
'python --version'
>>> completed_process.returncode
0
Capturing output

If you want to capture the output, you can pass subprocess.PIPE to the appropriate stderr or stdout :

>>> cp = subprocess.run('python --version', 
                        stderr=subprocess.PIPE, 
                        stdout=subprocess.PIPE)
>>> cp.stderr
b'Python 3.6.1 :: Anaconda 4.4.0 (64-bit)\r\n'
>>> cp.stdout
b''

(I find it interesting and slightly counterintuitive that the version info gets put to stderr instead of stdout.)

Pass a command list

One might easily move from manually providing a command string (like the question suggests) to providing a string built programmatically. Don't build strings programmatically. This is a potential security issue. It's better to assume you don't trust the input.

>>> import textwrap
>>> args = ['python', textwrap.__file__]
>>> cp = subprocess.run(args, stdout=subprocess.PIPE)
>>> cp.stdout
b'Hello there.\r\n  This is indented.\r\n'

Note, only args should be passed positionally.

Full Signature

Here's the actual signature in the source and as shown by help(run) :

def run(*popenargs, input=None, timeout=None, check=False, **kwargs):

The popenargs and kwargs are given to the Popen constructor. input can be a string of bytes (or unicode, if you specify encoding or universal_newlines=True ) that will be piped to the subprocess's stdin.
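A brief sketch of the input parameter described above, feeding text to the child process's stdin (the sort command is only an illustrative consumer):

import subprocess

cp = subprocess.run(
    ["sort"],
    input="banana\napple\ncherry\n",
    stdout=subprocess.PIPE,
    universal_newlines=True,       # treat input and output as text rather than bytes
)
print(cp.stdout)                   # apple, banana, cherry -- one per line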

The documentation describes timeout= and check=True better than I could:

The timeout argument is passed to Popen.communicate(). If the timeout expires, the child process will be killed and waited for. The TimeoutExpired exception will be re-raised after the child process has terminated.

If check is true, and the process exits with a non-zero exit code, a CalledProcessError exception will be raised. Attributes of that exception hold the arguments, the exit code, and stdout and stderr if they were captured.

and this example for check=True is better than one I could come up with:

>>> subprocess.run("exit 1", shell=True, check=True)
Traceback (most recent call last):
  ...
subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1
Expanded Signature

Here's an expanded signature, as given in the documentation:

subprocess.run(args, *, stdin=None, input=None, stdout=None, stderr=None, 
shell=False, cwd=None, timeout=None, check=False, encoding=None, 
errors=None)

Note that this indicates that only the args list should be passed positionally. So pass the remaining arguments as keyword arguments.

Popen

When should you use Popen instead? I would struggle to find a use case based on the arguments alone. Direct usage of Popen would, however, give you access to its methods, including poll, send_signal, terminate, and wait.
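A sketch of the kind of control those methods give you and run does not: start a process, poll it while doing other work, and terminate it if it runs too long (the sleep command is just a stand-in for a long task).

import subprocess
import time

proc = subprocess.Popen(["sleep", "30"])     # a stand-in for a long-running task

deadline = time.time() + 2
while proc.poll() is None and time.time() < deadline:
    time.sleep(0.1)                          # do other useful work here

if proc.poll() is None:                      # still running past the deadline
    proc.terminate()                         # send SIGTERM
    proc.wait()                              # reap the child process

print("exit status:", proc.returncode)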

Here's the Popen signature as given in the source . I think this is the most precise encapsulation of the information (as opposed to help(Popen) ):

def __init__(self, args, bufsize=-1, executable=None,
             stdin=None, stdout=None, stderr=None,
             preexec_fn=None, close_fds=_PLATFORM_DEFAULT_CLOSE_FDS,
             shell=False, cwd=None, env=None, universal_newlines=False,
             startupinfo=None, creationflags=0,
             restore_signals=True, start_new_session=False,
             pass_fds=(), *, encoding=None, errors=None):

But more informative is the Popen documentation :

subprocess.Popen(args, bufsize=-1, executable=None, stdin=None,
                 stdout=None, stderr=None, preexec_fn=None, close_fds=True,
                 shell=False, cwd=None, env=None, universal_newlines=False,
                 startupinfo=None, creationflags=0, restore_signals=True,
                 start_new_session=False, pass_fds=(), *, encoding=None, errors=None)

Execute a child program in a new process. On POSIX, the class uses os.execvp()-like behavior to execute the child program. On Windows, the class uses the Windows CreateProcess() function. The arguments to Popen are as follows.

Understanding the remaining documentation on Popen will be left as an exercise for the reader.

Corey Goldberg ,Dec 9, 2015 at 18:13

Use:
import os

cmd = 'ls -al'

os.system(cmd)

os - This module provides a portable way of using operating system-dependent functionality.

For more os functions, here is the documentation.

tripleee ,Dec 3, 2018 at 5:02

It can be this simple:
import os
cmd = "your command"
os.system(cmd)

tripleee ,Dec 3, 2018 at 4:59

use the os module
import os
os.system("your command")

eg

import os
os.system("ifconfig")

Peter Mortensen ,Jun 3, 2018 at 20:14

There is another difference here which is not mentioned previously.

subprocess.Popen executes the <command> as a subprocess. In my case, I need to execute file <a> which needs to communicate with another program, <b>.

I tried subprocess, and execution was successful. However <b> could not communicate with <a>. Everything is normal when I run both from the terminal.

One more: (NOTE: kwrite behaves differently from other applications. If you try the below with Firefox, the results will not be the same.)

If you try os.system("kwrite") , program flow freezes until the user closes kwrite. To overcome that I tried os.system("konsole -e kwrite") instead. This time the program continued to flow, but kwrite became a subprocess of the console.

Is there a way to run kwrite so that it is not a subprocess (i.e., in the system monitor it must appear at the leftmost edge of the tree)?

cdunn2001 ,Jan 18, 2011 at 19:21

subprocess.check_call is convenient if you don't want to test return values. It throws an exception on any error.
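A minimal sketch (the false command simply exits with a non-zero status):

import subprocess

subprocess.check_call(["ls", "-l"])          # returns 0 on success

try:
    subprocess.check_call(["false"])         # non-zero exit raises an exception
except subprocess.CalledProcessError as e:
    print("command failed with status", e.returncode)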

Emil Stenström ,Apr 30, 2014 at 14:37

I tend to use subprocess together with shlex (to handle escaping of quoted strings):
>>> import subprocess, shlex
>>> command = 'ls -l "/your/path/with spaces/"'
>>> call_params = shlex.split(command)
>>> print call_params
["ls", "-l", "/your/path/with spaces/"]
>>> subprocess.call(call_params)

mdwhatcott ,Aug 13, 2012 at 18:36

I quite like shell_command for its simplicity. It's built on top of the subprocess module.

Here's an example from the docs:

>>> from shell_command import shell_call
>>> shell_call("ls *.py")
setup.py  shell_command.py  test_shell_command.py
0
>>> shell_call("ls -l *.py")
-rw-r--r-- 1 ncoghlan ncoghlan  391 2011-12-11 12:07 setup.py
-rw-r--r-- 1 ncoghlan ncoghlan 7855 2011-12-11 16:16 shell_command.py
-rwxr-xr-x 1 ncoghlan ncoghlan 8463 2011-12-11 16:17 test_shell_command.py
0

Saurabh Bangad ,Jun 11, 2012 at 22:28

os.system does not allow you to store results, so if you want to store the results of the command in some list or something, subprocess.check_output works.

houqp ,May 1, 2014 at 20:49

Shameless plug, I wrote a library for this :P https://github.com/houqp/shell.py

It's basically a wrapper for popen and shlex for now. It also supports piping commands so you can chain commands easier in Python. So you can do things like:

ex('echo hello shell.py') | "awk '{print $2}'"

tripleee ,Dec 3, 2018 at 5:43

Under Linux, in case you would like to call an external command that will execute independently (i.e. will keep running after the Python script terminates), you can use a simple task queue such as task spooler or the at command.

An example with task spooler:

import os
os.system('ts <your-command>')

Notes about task spooler ( ts ):

  1. You could set the number of concurrent processes to be run ("slots") with:

    ts -S <number-of-slots>

  2. Installing ts doesn't require admin privileges. You can download and compile it from source with a simple make , add it to your path and you're done.

> ,Apr 16, 2013 at 20:23

You can use Popen, and then you can check the process's status:
from subprocess import Popen

proc = Popen(['ls', '-l'])
if proc.poll() is None:
    proc.kill()

Check out subprocess.Popen .

[Oct 13, 2019] what is the system function in python - Stack Overflow

Oct 13, 2019 | stackoverflow.com



Eva Feldman ,Jul 6, 2010 at 15:55

I want to play with the system command in Python. For example, in Perl we have this function: system("ls -la"); and it runs ls -la. What is the system function in Python? Thanks in advance.

Felix Kling ,Jul 6, 2010 at 15:58

It is os.system :
import os
os.system('ls -la')

But this won't give you any output. So subprocess.check_output is probably more what you want:

>>> import subprocess
>>> subprocess.check_output(["ls", "-l", "/dev/null"])
'crw-rw-rw- 1 root root 1, 3 Oct 18  2007 /dev/null\n'

KLee1 ,Jul 6, 2010 at 16:00

import os
os.system("")

From here

> ,

In the os module there is os.system() .

But if you want to do more advanced things with subprocesses the subprocess module provides a higher level interface with more possibilities that is usually preferable.

[Oct 13, 2019] python - Find size and free space of the filesystem containing a given file - Stack Overflow

Oct 13, 2019 | stackoverflow.com



Piskvor ,Aug 21, 2013 at 7:19

I'm using Python 2.6 on Linux. What is the fastest way:

Sven Marnach ,May 5, 2016 at 11:11

If you just need the free space on a device, see the answer using os.statvfs() below.

If you also need the device name and mount point associated with the file, you should call an external program to get this information. df will provide all the information you need -- when called as df filename it prints a line about the partition that contains the file.

To give an example:

import subprocess
df = subprocess.Popen(["df", "filename"], stdout=subprocess.PIPE)
output = df.communicate()[0]
device, size, used, available, percent, mountpoint = \
    output.split("\n")[1].split()

Note that this is rather brittle, since it depends on the exact format of the df output, but I'm not aware of a more robust solution. (There are a few solutions relying on the /proc filesystem below that are even less portable than this one.)

Halfgaar ,Feb 9, 2017 at 10:41

This doesn't give the name of the partition, but you can get the filesystem statistics directly using the statvfs Unix system call. To call it from Python, use os.statvfs('/home/foo/bar/baz') .

The relevant fields in the result, according to POSIX :

unsigned long f_frsize   Fundamental file system block size. 
fsblkcnt_t    f_blocks   Total number of blocks on file system in units of f_frsize. 
fsblkcnt_t    f_bfree    Total number of free blocks. 
fsblkcnt_t    f_bavail   Number of free blocks available to 
                         non-privileged process.

So to make sense of the values, multiply by f_frsize :

import os
statvfs = os.statvfs('/home/foo/bar/baz')

statvfs.f_frsize * statvfs.f_blocks     # Size of filesystem in bytes
statvfs.f_frsize * statvfs.f_bfree      # Actual number of free bytes
statvfs.f_frsize * statvfs.f_bavail     # Number of free bytes that ordinary users
                                        # are allowed to use (excl. reserved space)

Halfgaar ,Feb 9, 2017 at 10:44

import os

def get_mount_point(pathname):
    "Get the mount point of the filesystem containing pathname"
    pathname= os.path.normcase(os.path.realpath(pathname))
    parent_device= path_device= os.stat(pathname).st_dev
    while parent_device == path_device:
        mount_point= pathname
        pathname= os.path.dirname(pathname)
        if pathname == mount_point: break
        parent_device= os.stat(pathname).st_dev
    return mount_point

def get_mounted_device(pathname):
    "Get the device mounted at pathname"
    # uses "/proc/mounts"
    pathname= os.path.normcase(pathname) # might be unnecessary here
    try:
        with open("/proc/mounts", "r") as ifp:
            for line in ifp:
                fields= line.rstrip('\n').split()
                # note that line above assumes that
                # no mount points contain whitespace
                if fields[1] == pathname:
                    return fields[0]
    except EnvironmentError:
        pass
    return None # explicit

def get_fs_freespace(pathname):
    "Get the free space of the filesystem containing pathname"
    stat= os.statvfs(pathname)
    # use f_bfree for superuser, or f_bavail if filesystem
    # has reserved space for superuser
    return stat.f_bfree*stat.f_bsize

Some sample pathnames on my computer:

path 'trash':
  mp /home /dev/sda4
  free 6413754368
path 'smov':
  mp /mnt/S /dev/sde
  free 86761562112
path '/usr/local/lib':
  mp / rootfs
  free 2184364032
path '/proc/self/cmdline':
  mp /proc proc
  free 0
PS: if you are on Python ≥ 3.3, there's shutil.disk_usage(path), which returns a named tuple of (total, used, free) expressed in bytes.

Xiong Chiamiov ,Sep 30, 2016 at 20:39

As of Python 3.3, there's an easy and direct way to do this with the standard library:
$ cat free_space.py 
#!/usr/bin/env python3

import shutil

total, used, free = shutil.disk_usage(__file__)
print(total, used, free)

$ ./free_space.py 
1007870246912 460794834944 495854989312

These numbers are in bytes. See the documentation for more info.

Giampaolo Rodolà ,Aug 16, 2017 at 9:08

This should do everything you asked for:
import os
from collections import namedtuple

disk_ntuple = namedtuple('partition',  'device mountpoint fstype')
usage_ntuple = namedtuple('usage',  'total used free percent')

def disk_partitions(all=False):
    """Return all mountd partitions as a nameduple.
    If all == False return phyisical partitions only.
    """
    phydevs = []
    f = open("/proc/filesystems", "r")
    for line in f:
        if not line.startswith("nodev"):
            phydevs.append(line.strip())

    retlist = []
    f = open('/etc/mtab', "r")
    for line in f:
        if not all and line.startswith('none'):
            continue
        fields = line.split()
        device = fields[0]
        mountpoint = fields[1]
        fstype = fields[2]
        if not all and fstype not in phydevs:
            continue
        if device == 'none':
            device = ''
        ntuple = disk_ntuple(device, mountpoint, fstype)
        retlist.append(ntuple)
    return retlist

def disk_usage(path):
    """Return disk usage associated with path."""
    st = os.statvfs(path)
    free = (st.f_bavail * st.f_frsize)
    total = (st.f_blocks * st.f_frsize)
    used = (st.f_blocks - st.f_bfree) * st.f_frsize
    try:
        percent = ret = (float(used) / total) * 100
    except ZeroDivisionError:
        percent = 0
    # NB: the percentage is about 5% lower than what df shows, due to
    # reserved blocks that we are currently not considering:
    # http://goo.gl/sWGbH
    return usage_ntuple(total, used, free, round(percent, 1))


if __name__ == '__main__':
    for part in disk_partitions():
        print part
        print "    %s\n" % str(disk_usage(part.mountpoint))

On my box the code above prints:

giampaolo@ubuntu:~/dev$ python foo.py 
partition(device='/dev/sda3', mountpoint='/', fstype='ext4')
    usage(total=21378641920, used=4886749184, free=15405903872, percent=22.9)

partition(device='/dev/sda7', mountpoint='/home', fstype='ext4')
    usage(total=30227386368, used=12137168896, free=16554737664, percent=40.2)

partition(device='/dev/sdb1', mountpoint='/media/1CA0-065B', fstype='vfat')
    usage(total=7952400384, used=32768, free=7952367616, percent=0.0)

partition(device='/dev/sr0', mountpoint='/media/WB2PFRE_IT', fstype='iso9660')
    usage(total=695730176, used=695730176, free=0, percent=100.0)

partition(device='/dev/sda6', mountpoint='/media/Dati', fstype='fuseblk')
    usage(total=914217758720, used=614345637888, free=299872120832, percent=67.2)

AK47 ,Jul 7, 2016 at 10:37

The simplest way to find it out:
import os
from collections import namedtuple

DiskUsage = namedtuple('DiskUsage', 'total used free')

def disk_usage(path):
    """Return disk usage statistics about the given path.

    Will return the namedtuple with attributes: 'total', 'used' and 'free',
    which are the amount of total, used and free space, in bytes.
    """
    st = os.statvfs(path)
    free = st.f_bavail * st.f_frsize
    total = st.f_blocks * st.f_frsize
    used = (st.f_blocks - st.f_bfree) * st.f_frsize
    return DiskUsage(total, used, free)

tzot ,Aug 8, 2011 at 10:11

For the first point, you can try using os.path.realpath to get a canonical path, check it against /etc/mtab (I'd actually suggest calling getmntent , but I can't find a normal way to access it) to find the longest match. (to be sure, you should probably stat both the file and the presumed mountpoint to verify that they are in fact on the same device)

For the second point, use os.statvfs to get block size and usage information.

(Disclaimer: I have tested none of this, most of what I know came from the coreutils sources)

andrew ,Dec 15, 2017 at 0:55

For the second part of your question, "get usage statistics of the given partition", psutil makes this easy with the disk_usage(path) function. Given a path, disk_usage() returns a named tuple including total, used, and free space expressed in bytes, plus the percentage usage.

Simple example from documentation:

>>> import psutil
>>> psutil.disk_usage('/')
sdiskusage(total=21378641920, used=4809781248, free=15482871808, percent=22.5)

Psutil works with Python versions from 2.6 to 3.6 and on Linux, Windows, and OSX among other platforms.

Donald Duck ,Jan 12, 2018 at 18:28

import os

def disk_stat(path):
    disk = os.statvfs(path)
    percent = (disk.f_blocks - disk.f_bfree) * 100 / (disk.f_blocks -disk.f_bfree + disk.f_bavail) + 1
    return percent


print disk_stat('/')
print disk_stat('/data')


Usually the /proc directory contains such information in Linux, it is a virtual filesystem. For example, /proc/mounts gives information about current mounted disks; and you can parse it directly. Utilities like top , df all make use of /proc .

I haven't used it, but this might help too, if you want a wrapper: http://bitbucket.org/chrismiles/psi/wiki/Home

[Oct 09, 2019] oop - Perl Importing Variables From Calling Module - Stack Overflow

Oct 09, 2019 | stackoverflow.com



Russell C. ,Aug 31, 2010 at 20:31

I have a Perl module (Module.pm) that initializes a number of variables, some of which I'd like to import ($VAR2, $VAR3) into additional submodules that it might load during execution.

The way I'm currently setting up Module.pm is as follows:

package Module;

use warnings;
use strict;

use vars qw($SUBMODULES $VAR1 $VAR2 $VAR3);

require Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw($VAR2 $VAR3);

sub new {
    my ($package) = @_;
    my $self = {};
    bless ($self, $package);
    return $self;
}

sub SubModules1 {
    my $self = shift;
    if($SUBMODULES->{'1'}) { return $SUBMODULES->{'1'}; }

    # Load & cache submodule
    require Module::SubModule1;
    $SUBMODULES->{'1'} = Module::SubModule1->new(@_);    
    return $SUBMODULES->{'1'};
}

sub SubModules2 {
    my $self = shift;
    if($SUBMODULES->{'2'}) { return $SUBMODULES->{'2'}; }

    # Load & cache submodule
    require Module::SubModule2;
    $SUBMODULES->{'2'} = Module::SubModule2->new(@_);    
    return $SUBMODULES->{'2'};
}

Each submodule is structured as follows:

package Module::SubModule1;

use warnings;
use strict;
use Carp;

use vars qw();

sub new {
    my ($package) = @_;
    my $self = {};
    bless ($self, $package);
    return $self;
}

I want to be able to import the $VAR2 and $VAR3 variables into each of the submodules without having to reference them as $Module::VAR2 and $Module::VAR3. I noticed that the calling script is able to access both the variables that I have exported in Module.pm in the desired fashion but SubModule1.pm and SubModule2.pm still have to reference the variables as being from Module.pm.

I tried updating each submodule as follows, which unfortunately didn't work as I was hoping:

package Module::SubModule1;

use warnings;
use strict;
use Carp;

use vars qw($VAR2 $VAR3);

sub new {
    my ($package) = @_;
    my $self = {};
    bless ($self, $package);
    $VAR2 = $Module::VAR2;
    $VAR3 = $Module::VAR3;
    return $self;
}

Please let me know how I can successfully export $VAR2 and $VAR3 from Module.pm into each Submodule. Thanks in advance for your help!

Russell C. ,Aug 31, 2010 at 22:37

In your submodules, are you forgetting to say
use Module;

? Calling use Module from another package (say Module::Submodule9 ) will try to run the Module::import method. Since you don't have that method, it will call the Exporter::import method, and that is where the magic that exports Module 's variables into the Module::Submodule9 namespace will happen.


In your program there is only one Module namespace and only one instance of the (global) variable $Module::VAR2 . Exporting creates aliases to this variable in other namespaces, so the same variable can be accessed in different ways. Try this in a separate script:

package Whatever;
use Module;
use strict;
use vars qw($VAR2);

$Module::VAR2 = 5;
print $Whatever::VAR2;    # should be 5.
$VAR2 = 14;               # same as $Whatever::VAR2 = 14
print $Module::VAR2;      # should be 14

Russell C. ,Aug 31, 2010 at 21:38

Well there is the easy way:

In M.pm:

package M;

use strict;
use warnings;

#our is better than "use vars" for creating package variables
#it creates an alias to $M::foo named $foo in the current lexical scope 
our $foo = 5;

sub inM { print "$foo\n" }

1;

In M/S.pm

package M;

#creates an alias to $M::foo that will last for the entire scope,
#in this case the entire file
our $foo;

package M::S;

use strict;
use warnings;

sub inMS { print "$foo\n" }

1;

In the script:

#!/usr/bin/perl

use strict;
use warnings;

use M;
use M::S;

M::inM();
M::S::inMS();

But I would advise against this. Global variables are not a good practice, and sharing global variables between modules is even worse.
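
One common alternative (a minimal sketch added here for illustration, not part of the original answer) is to keep the value private to the module and export an accessor sub instead, reworking the M.pm example above:

package M;

use strict;
use warnings;
use Exporter 'import';

our @EXPORT_OK = qw(foo);

my $foo = 5;         # private to M; no other package can touch it directly
sub foo { $foo }     # read-only accessor

1;

# in the caller:
#   use M qw(foo);
#   print foo(), "\n";   # prints 5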

[Oct 09, 2019] gzip - How can I recover files from a corrupted .tar.gz archive - Stack Overflow

Oct 09, 2019 | stackoverflow.com



George ,Jun 24, 2016 at 2:49

Are you sure that it is a gzip file? I would first run 'file SMS.tar.gz' to validate that.

Then I would read The gzip Recovery Toolkit page.

JohnEye ,Oct 4, 2016 at 11:27

Recovery is possible but it depends on what caused the corruption.

If the file is just truncated, getting some partial result out is not too hard; just run

gunzip < SMS.tar.gz > SMS.tar.partial

which will give some output despite the error at the end.

If the compressed file has large missing blocks, it's basically hopeless after the bad block.

If the compressed file is systematically corrupted in small ways (e.g. transferring the binary file in ASCII mode, which smashes carriage returns and newlines throughout the file), it is possible to recover but requires quite a bit of custom programming, it's really only worth it if you have absolutely no other recourse (no backups) and the data is worth a lot of effort. (I have done it successfully.) I mentioned this scenario in a previous question .

The answers for .zip files differ somewhat, since zip archives have multiple separately-compressed members, so there's more hope (though most commercial tools are rather bogus, they eliminate warnings by patching CRCs, not by recovering good data). But your question was about a .tar.gz file, which is an archive with one big member.


Here is one possible scenario that we encountered. We had a tar.gz file that would not decompress; trying to unzip it gave the error:
gzip -d A.tar.gz
gzip: A.tar.gz: invalid compressed data--format violated

I figured out that the file may have been originally uploaded over a non-binary FTP connection (we don't know for sure).

The solution was relatively simple, using the Unix dos2unix utility:

dos2unix A.tar.gz
dos2unix: converting file A.tar.gz to UNIX format ...
tar -xvf A.tar
file1.txt
file2.txt 
....etc.

It worked! This is one slim possibility, and maybe worth a try - it may help somebody out there.

[Oct 09, 2019] scope - What is the difference between my and our in Perl - Stack Overflow

Oct 09, 2019 | stackoverflow.com



Nathan Fellman ,May 10, 2009 at 10:24

I know what my is in Perl. It defines a variable that exists only in the scope of the block in which it is defined. What does our do? How does our differ from my ?

Nathan Fellman ,Nov 20, 2016 at 1:15

Great question: How does our differ from my and what does our do?

In Summary:

Available since Perl 5, my is a way to declare non-package variables that are: private, new each time the enclosing scope is entered, non-global, and unrelated to any package.

On the other hand, our variables are: package variables, and thus automatically global; definitely not private; not new; and accessible from other packages through their fully qualified name. our merely gives you a lexically scoped, unqualified alias to the package variable.

Declaring a variable with our allows you to predeclare variables in order to use them under use strict without getting typo warnings or compile-time errors. Since Perl 5.6, it has replaced the obsolete use vars , which was only file-scoped, and not lexically scoped as is our .

For example, the formal, qualified name for variable $x inside package main is $main::x . Declaring our $x allows you to use the bare $x variable without penalty (i.e., without a resulting error), in the scope of the declaration, when the script uses use strict or use strict "vars" . The scope might be one, or two, or more packages, or one small block.

Georg ,Oct 1, 2016 at 6:41

The PerlMonks and PerlDoc links from cartman and Olafur are a great reference - below is my crack at a summary:

my variables are lexically scoped within a single block defined by {} or within the same file if not in {} s. They are not accessible from packages/subroutines defined outside of the same lexical scope / block.

our variables are scoped within a package/file and accessible from any code that use or require that package/file - name conflicts are resolved between packages by prepending the appropriate namespace.

Just to round it out, local variables are "dynamically" scoped, differing from my variables in that they are also accessible from subroutines called within the same block.

Nathan Fellman ,Nov 20, 2015 at 18:46

An example:
use strict;

for (1 .. 2){
    # Both variables are lexically scoped to the block.
    our ($o);  # Belongs to 'main' package.
    my  ($m);  # Does not belong to a package.

    # The variables differ with respect to newness.
    $o ++;
    $m ++;
    print __PACKAGE__, " >> o=$o m=$m\n";  # $m is always 1.

    # The package has changed, but we still have direct,
    # unqualified access to both variables, because the
    # lexical scope has not changed.
    package Fubb;
    print __PACKAGE__, " >> o=$o m=$m\n";
}

# The our() and my() variables differ with respect to privacy.
# We can still access the variable declared with our(), provided
# that we fully qualify its name, but the variable declared
# with my() is unavailable.
print __PACKAGE__, " >> main::o=$main::o\n";  # 2
print __PACKAGE__, " >> main::m=$main::m\n";  # Undefined.

# Attempts to access the variables directly won't compile.
# print __PACKAGE__, " >> o=$o\n";
# print __PACKAGE__, " >> m=$m\n";

# Variables declared with use vars() are like those declared
# with our(): belong to a package; not private; and not new.
# However, their scoping is package-based rather than lexical.
for (1 .. 9){
    use vars qw($uv);
    $uv ++;
}

# Even though we are outside the lexical scope where the
# use vars() variable was declared, we have direct access
# because the package has not changed.
print __PACKAGE__, " >> uv=$uv\n";

# And we can access it from another package.
package Bubb;
print __PACKAGE__, " >> main::uv=$main::uv\n";

daotoad ,May 10, 2009 at 16:37

Coping with Scoping is a good overview of Perl scoping rules. It's old enough that our is not discussed in the body of the text. It is addressed in the Notes section at the end.

The article talks about package variables and dynamic scope and how that differs from lexical variables and lexical scope.

Chas. Owens ,Oct 7, 2013 at 14:02

my is used for local variables, where as our is used for global variables. More reading over Variable Scoping in Perl: the basics .

ruffin ,Feb 10, 2015 at 19:47

It's an old question, but I ever met some pitfalls about lexical declarations in Perl that messed me up, which are also related to this question, so I just add my summary here:

1. definition or declaration?

local $var = 42; 
print "var: $var\n";

The output is var: 42 . However we couldn't tell if local $var = 42; is a definition or declaration. But how about this:

use strict;
use warnings;

local $var = 42;
print "var: $var\n";

The second program will throw an error:

Global symbol "$var" requires explicit package name.

$var is not defined, which means local $var; is just a declaration! Before using local to declare a variable, make sure that it is defined as a global variable previously.
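
For example (a minimal sketch added for illustration), declaring the package global first with our lets the local line compile cleanly under strict:

use strict;
use warnings;

our $var;              # declare the package global first
local $var = 42;       # local now just gives it a temporary value
print "var: $var\n";   # prints: var: 42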

But why this won't fail?

use strict;
use warnings;

local $a = 42;
print "var: $a\n";

The output is: var: 42 .

That's because $a , as well as $b , is a global variable pre-defined in Perl. Remember the sort function?
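
As a small illustration (not from the original post), sort uses the package globals $a and $b in its comparison block, which is why they never need to be declared even under strict:

use strict;
use warnings;

# sort sets the package variables $a and $b for each comparison
my @sorted = sort { $a <=> $b } (10, 2, 33);
print "@sorted\n";     # prints: 2 10 33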

2. lexical or global?

I was a C programmer before starting to use Perl, so the concept of lexical and global variables seemed straightforward to me: they just correspond to automatic and external variables in C. But there are subtle differences:

In C, an external variable is a variable defined outside any function block. On the other hand, an automatic variable is a variable defined inside a function block. Like this:

int global;

int main(void) {
    int local;
}

While in Perl, things are subtle:

sub main {
    $var = 42;
}

&main;

print "var: $var\n";

The output is var: 42 ; $var is a global variable even though it's defined in a function block! Actually, in Perl any variable is a global (package) variable by default.

The lesson is to always add use strict; use warnings; at the beginning of a Perl program, which will force the programmer to declare the lexical variable explicitly, so that we don't get messed up by some mistakes taken for granted.

Ólafur Waage ,May 10, 2009 at 10:25

The perldoc has a good definition of our.

Unlike my, which both allocates storage for a variable and associates a simple name with that storage for use within the current scope, our associates a simple name with a package variable in the current package, for use within the current scope. In other words, our has the same scoping rules as my, but does not necessarily create a variable.

Cosmicnet ,Nov 22, 2014 at 13:57

This is only somewhat related to the question, but I've just discovered a (to me) obscure bit of perl syntax that you can use with "our" (package) variables that you can't use with "my" (local) variables.
#!/usr/bin/perl

our $foo = "BAR";

print $foo . "\n";
${"foo"} = "BAZ";
print $foo . "\n";

Output:

BAR
BAZ

This won't work if you change 'our' to 'my'.

Okuma.Scott ,Sep 6, 2014 at 20:13

print "package is: " . __PACKAGE__ . "\n";
our $test = 1;
print "trying to print global var from main package: $test\n";

package Changed;

{
        my $test = 10;
        my $test1 = 11;
        print "trying to print local vars from a closed block: $test, $test1\n";
}

&Check_global;

sub Check_global {
        print "trying to print global var from a function: $test\n";
}
print "package is: " . __PACKAGE__ . "\n";
print "trying to print global var outside the func and from \"Changed\" package:     $test\n";
print "trying to print local var outside the block $test1\n";

Will Output this:

package is: main
trying to print global var from main package: 1
trying to print local vars from a closed block: 10, 11
trying to print global var from a function: 1
package is: Changed
trying to print global var outside the func and from "Changed" package: 1
trying to print local var outside the block

In case using "use strict" will get this failure while attempting to run the script:

Global symbol "$test1" requires explicit package name at ./check_global.pl line 24.
Execution of ./check_global.pl aborted due to compilation errors.

Nathan Fellman ,Nov 5, 2015 at 14:03

Just try to use the following program :
#!/usr/local/bin/perl
use feature ':5.10';
#use warnings;
package a;
{
my $b = 100;
our $a = 10;


print "$a \n";
print "$b \n";
}

package b;

#my $b = 200;
#our $a = 20 ;

print "in package b value of  my b $a::b \n";
print "in package b value of our a  $a::a \n";

Nathan Fellman ,May 16, 2013 at 11:07

#!/usr/bin/perl -l

use strict;

# if string below commented out, prints 'lol' , if the string enabled, prints 'eeeeeeeee'
#my $lol = 'eeeeeeeeeee' ;
# no errors or warnings at any case, despite of 'strict'

our $lol = eval {$lol} || 'lol' ;

print $lol;

Evgeniy ,Jan 27, 2016 at 4:57

Let us think what an interpreter actually is: it's a piece of code that stores values in memory and lets the instructions in a program that it interprets access those values by their names, which are specified inside these instructions. So, the big job of an interpreter is to shape the rules of how we should use the names in those instructions to access the values that the interpreter stores.

On encountering "my", the interpreter creates a lexical variable: a named value that the interpreter can access only while it executes a block, and only from within that syntactic block. On encountering "our", the interpreter makes a lexical alias of a package variable: it binds a name, which the interpreter is supposed from then on to process as a lexical variable's name, until the block is finished, to the value of the package variable with the same name.

The effect is that you can then pretend that you're using a lexical variable and bypass the rules of 'use strict' on full qualification of package variables. Since the interpreter automatically creates package variables when they are first used, the side effect of using "our" may also be that the interpreter creates a package variable as well. In this case, two things are created: a package variable, which the interpreter can access from everywhere, provided it's properly designated as requested by 'use strict' (prepended with the name of its package and two colons), and its lexical alias.


[Oct 09, 2019] Perl Import Package in different Namespace - Stack Overflow

Oct 09, 2019 | stackoverflow.com



choroba ,Sep 28, 2018 at 22:17

Is it possible to import ( use ) a Perl module within a different namespace?

Let's say I have a module A (an XS module with no methods exported; @EXPORT is empty) and I have no way of changing the module.

This Module has a Method A::open

Currently I can use that module in my main program (package main) by calling A::open. I would like to have that module inside my package main so that I can directly call open.

I tried to manually push every key of %A:: into %main:: however that did not work as expected.

The only way that I know to achieve what I want is by using package A; inside my main program, effectively changing the package of my program from main to A. I'm not satisfied with this. I would really like to keep my program inside package main.

Is there any way to achieve this and still keep my program in package main?

Off-topic: Yes, I know you usually would not want to import everything into your namespace, but this module is used by us extensively and we don't want to type A:: (well, the actual module name is way longer, which isn't making the situation better) in front of hundreds or thousands of calls.

Grinnz ,Oct 1, 2018 at 6:26

This is one of those "impossible" situations, where the clear solution -- to rework that module -- is off limits.

But, you can alias that package's subs names, from its symbol table, to the same names in main . Worse than being rude, this comes with a glitch: it catches all names that that package itself imported in any way. However, since this package is a fixed quantity it stands to reason that you can establish that list (and even hard-code it). It is just this one time, right?

main

use warnings;
use strict;
use feature 'say';

use OffLimits;

GET_SUBS: {
    # The list of names to be excluded
    my $re_exclude = qr/^(?:BEGIN|import)$/;  # ...
    my @subs = grep { !/$re_exclude/ } sort keys %OffLimits::;
    no strict 'refs';
    for my $sub_name (@subs) {
        *{ $sub_name } = \&{ 'OffLimits::' . $sub_name };
    }   
};

my $name = name('name() called from ' . __PACKAGE__);
my $id   = id('id() called from ' . __PACKAGE__);

say "name() returned: $name";
say "id()   returned: $id";

with OffLimits.pm

package OffLimits;    
use warnings;
use strict;

sub name { return "In " .  __PACKAGE__ . ": @_" }
sub id   { return "In " .  __PACKAGE__ . ": @_" }

1;

It prints

name() returned: In OffLimits: name() called from  main
id()   returned: In OffLimits: id() called from  main

You may need that code in a BEGIN block, depending on other details.

Another option is of course to hard-code the subs to be "exported" (in @subs ). Given that the module is in practice immutable this option is reasonable and more reliable.


This can also be wrapped in a module, so that you have the normal, selective, importing.

WrapOffLimits.pm

package WrapOffLimits;
use warnings;
use strict;

use OffLimits;

use Exporter qw(import);

our @sub_names;
our @EXPORT_OK   = @sub_names;
our %EXPORT_TAGS = (all => \@sub_names);

BEGIN { 
    # Or supply a hard-coded list of all module's subs in @sub_names
    my $re_exclude = qr/^(?:BEGIN|import)$/;  # ...
    @sub_names = grep { !/$re_exclude/ } sort keys %OffLimits::;

    no strict 'refs';
    for my $sub_name (@sub_names) {
        *{ $sub_name } = \&{ 'OffLimits::' . $sub_name };
    }   
};
1;

and now in the caller you can import either only some subs

use WrapOffLimits qw(name);

or all

use WrapOffLimits qw(:all);

with otherwise the same main as above for a test.

The module name is hard-coded, which should be OK as this is meant only for that module.


The following is added mostly for completeness.

One can pass the module name to the wrapper by writing one's own import sub, which is what gets used then. The import list can be passed as well, at the expense of an awkward interface of the use statement.

It goes along the lines of

package WrapModule;
use warnings;
use strict;

use OffLimits;

use Exporter qw();  # will need our own import 

our ($mod_name, @sub_names);

our @EXPORT_OK   = @sub_names;
our %EXPORT_TAGS = (all => \@sub_names);

sub import {
    my $mod_name = splice @_, 1, 1;  # remove mod name from @_ for goto

    my $re_exclude = qr/^(?:BEGIN|import)$/;  # etc

    no strict 'refs';
    @sub_names = grep { !/$re_exclude/ } sort keys %{ $mod_name . '::'};    
    for my $sub_name (@sub_names) {    
        *{ $sub_name } = \&{ $mod_name . '::' . $sub_name };
    }   

    push @EXPORT_OK, @sub_names;

    goto &Exporter::import;
}
1;

what can be used as

use WrapModule qw(OffLimits name id);  # or (OffLimits :all)

or, with the list broken-up so to remind the user of the unusual interface

use WrapModule 'OffLimits', qw(name id);

When used with the main above this prints the same output.

The use statement ends up using the import sub defined in the module, which exports symbols by writing to the caller's symbol table. (If no import sub is written then the Exporter 's import method is nicely used, which is how this is normally done.)

This way we are able to unpack the arguments and have the module name supplied at use invocation. With the import list supplied as well now we have to push manually to @EXPORT_OK since this can't be in the BEGIN phase. In the end the sub is replaced by Exporter::import via the (good form of) goto , to complete the job.

Simerax ,Sep 30, 2018 at 10:19

You can forcibly "import" a function into main using glob assignment to alias the subroutine (and you want to do it in BEGIN so it happens at compile time, before calls to that subroutine are parsed later in the file):
use strict;
use warnings;
use Other::Module;

BEGIN { *open = \&Other::Module::open }

However, another problem you might have here is that open is a builtin function, which may cause some problems . You can add use subs 'open'; to indicate that you want to override the built-in function in this case, since you aren't using an actual import function to do so.
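
Putting the two suggestions together, a minimal sketch (Other::Module and its open sub are the placeholder names used above):

use strict;
use warnings;
use subs 'open';                 # predeclare open so it overrides the built-in
use Other::Module;               # placeholder module providing Other::Module::open

BEGIN { *open = \&Other::Module::open }   # alias it into the current package

# from here on, open(...) calls Other::Module::open rather than the built-in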

Grinnz ,Sep 30, 2018 at 17:33

Here is what I now came up with. Yes, this is hacky, and yes, I also feel like I opened Pandora's box with this. However, at least a small dummy program ran perfectly fine.

I renamed the module in my code again. In my original post I used the example A::open; actually this module does not contain any method/variable reserved by the Perl core. This is why I blindly import everything here.

BEGIN {
    # using the caller to determine the parent. Usually this is main but maybe we want it somewhere else in some cases
    my ($parent_package) = caller;

    package A;

    foreach (keys(%A::)) {
        if (defined $$_) {
            eval '*'.$parent_package.'::'.$_.' = \$A::'.$_;
        }
        elsif (%$_) {
            eval '*'.$parent_package.'::'.$_.' = \%A::'.$_;
        }
        elsif (@$_) {
            eval '*'.$parent_package.'::'.$_.' = \@A::'.$_;
        }
        else {
            eval '*'.$parent_package.'::'.$_.' = \&A::'.$_;
        }
    }
}

[Oct 08, 2019] Perl constant array

Oct 08, 2019 | stackoverflow.com



Alec ,Sep 5, 2018 at 8:25

use constant {
    COLUMNS => qw/ TEST1 TEST2 TEST3 /,
}

Can I store an array using the constant package in Perl?

Whenever I go on to try to use the array like my @attr = (COLUMNS); , it does not contain the values.

Сухой27 ,Aug 12, 2013 at 13:37

use constant {
  COLUMNS => [qw/ TEST1 TEST2 TEST3 /],
};

print @{+COLUMNS};


Or remove the curly braces as the docs show :-
use strict;
use constant COLUMNS => qw/ TEST1 TEST2 TEST3 /;

my @attr = (COLUMNS);
print @attr;

which gives :-

 % perl test.pl
TEST1TEST2TEST3

Your code actually defines two constants COLUMNS and TEST2 :-

use strict;
use constant { COLUMNS => qw/ TEST1 TEST2 TEST3 /, };

my @attr = (COLUMNS);
print @attr;
print TEST2

and gives :-

% perl test.pl
TEST1TEST3

[Sep 30, 2019] python - Delete a file or folder - Stack Overflow

Sep 30, 2019 | stackoverflow.com

Zygimantas ,Aug 9, 2011 at 13:05

How to delete a file or folder in Python?

Lu55 ,Jul 10, 2018 at 13:52

os.remove() removes a file.

os.rmdir() removes an empty directory.

shutil.rmtree() deletes a directory and all its contents.


Path objects from the Python 3.4+ pathlib module also expose the corresponding instance methods: Path.unlink() removes a file, and Path.rmdir() removes an empty directory (there is no pathlib equivalent of shutil.rmtree()).

Éric Araujo ,May 22 at 21:37

Python syntax to delete a file
import os
os.remove("/tmp/<file_name>.txt")

Or

import os
os.unlink("/tmp/<file_name>.txt")
Best practice
  1. First, check whether the file or folder exists, and only then delete it. This can be achieved in two ways:
    a. os.path.isfile("/path/to/file")
    b. Use exception handling.

EXAMPLE for os.path.isfile

#!/usr/bin/python
import os
myfile="/tmp/foo.txt"

## If file exists, delete it ##
if os.path.isfile(myfile):
    os.remove(myfile)
else:    ## Show an error ##
    print("Error: %s file not found" % myfile)
Exception Handling
#!/usr/bin/python
import os

## Get input ##
myfile= raw_input("Enter file name to delete: ")

## Try to delete the file ##
try:
    os.remove(myfile)
except OSError as e:  ## if failed, report it back to the user ##
    print ("Error: %s - %s." % (e.filename, e.strerror))
RESPECTIVE OUTPUT
Enter file name to delete : demo.txt
Error: demo.txt - No such file or directory.

Enter file name to delete : rrr.txt
Error: rrr.txt - Operation not permitted.

Enter file name to delete : foo.txt
Python syntax to delete a folder
shutil.rmtree()

Example for shutil.rmtree()

#!/usr/bin/python
import os
import sys
import shutil

# Get directory name
mydir= raw_input("Enter directory name: ")

## Try to remove tree; if failed show an error using try...except on screen
try:
    shutil.rmtree(mydir)
except OSError as e:
    print ("Error: %s - %s." % (e.filename, e.strerror))

Paebbels ,Apr 25, 2016 at 19:38

Use
shutil.rmtree(path[, ignore_errors[, onerror]])

(See complete documentation on shutil ) and/or

os.remove

and

os.rmdir

(Complete documentation on os .)

Kaz ,Sep 8, 2018 at 22:37

Create a function for you guys.
import os
import shutil

def remove(path):
    """ param <path> could either be relative or absolute. """
    if os.path.isfile(path):
        os.remove(path)  # remove the file
    elif os.path.isdir(path):
        shutil.rmtree(path)  # remove dir and all it contains
    else:
        raise ValueError("file {} is not a file or dir.".format(path))

[Sep 30, 2019] Get command line arguments as string

Sep 30, 2019 | stackoverflow.com

KocT9H ,Jun 6, 2016 at 12:58

I want to print all command line arguments as a single string. Example of how I call my script and what I expect to be printed:
./RunT.py mytst.tst -c qwerty.c

mytst.tst -c qwerty.c

The code that does that:

args = str(sys.argv[1:])
args = args.replace("[","")
args = args.replace("]","")
args = args.replace(",","")
args = args.replace("'","")
print args

I did all replaces because sys.argv[1:] returns this:

['mytst.tst', '-c', 'qwerty.c']

Is there a better way to get same result? I don't like those multiple replace calls

KocT9H ,Apr 5 at 12:16

An option:
import sys
' '.join(sys.argv[1:])

The join() function joins its arguments by whatever string you call it on. So ' '.join(...) joins the arguments with single spaces ( ' ' ) between them.

cxw ,Apr 5 at 12:18

The command line arguments are already handled by the shell before they are sent into sys.argv . Therefore, shell quoting and whitespace are gone and cannot be exactly reconstructed.

Assuming the user double-quotes strings with spaces, here's a python program to reconstruct the command string with those quotes.

commandstring = '';  

for arg in sys.argv[1:]:          # skip sys.argv[0] since the question didn't ask for it
    if ' ' in arg:
        commandstring+= '"{}"  '.format(arg) ;   # Put the quotes back in
    else:
        commandstring+="{}  ".format(arg) ;      # Assume no space => no quotes

print(commandstring);

For example, the command line

./saferm.py sdkf lsadkf -r sdf -f sdf -fs -s "flksjfksdkfj sdfsdaflkasdf"

will produce the same arguments as output:

sdkf lsadkf -r sdf -f sdf -fs -s "flksjfksdkfj sdfsdaflkasdf"

since the user indeed double-quoted only arguments with strings.

TryToSolveItSimple ,Apr 5 at 12:27

You're getting a list object with all of your arguments when you use the syntax [1:] which goes from the second argument to the last. You could run a for each loop to join them into one string:
args = sys.argv[1:]
result = ''

for arg in args:
    result += " " + arg

pushpen.paul ,Jul 13 at 1:39

None of the previous answers properly escape all possible arguments, like empty args or those containing quotes. The closest you can get with minimal code is to use shlex.quote (available since Python 3.3):
import shlex
cmdline = " ".join(map(shlex.quote, sys.argv[1:]))

[Sep 30, 2019] How can I convert a string to a number in Perl?

Sep 30, 2019 | stackoverflow.com



TimK ,Mar 1, 2016 at 22:51

How would I convert a string holding a number into its numeric value in Perl?

OrangeDog ,May 22, 2014 at 15:24

You don't need to convert it at all:
% perl -e 'print "5.45" + 0.1;'
5.55


This is a simple solution:

Example 1

my $var1 = "123abc";
print $var1 + 0;

Result

123

Example 2

my $var2 = "abc123";
print $var2 + 0;

Result

0
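
If you need to know whether a string actually looks numeric before relying on this implicit conversion, the core module Scalar::Util provides looks_like_number; a small sketch added for illustration:

use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

for my $str ("5.45", "123abc", "abc123") {
    if (looks_like_number($str)) {
        print "$str is numeric: ", $str + 0, "\n";
    } else {
        print "$str is not a plain number\n";   # "123abc" and "abc123" end up here
    }
}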

[Sep 28, 2019] python - Get command line arguments as string - Stack Overflow

Sep 28, 2019 | stackoverflow.com

You can also use format and multiply a placeholder string by the length of sys.argv:

(' {}'*len(sys.argv)).lstrip().format(*sys.argv)

Or do the same with the % operator:
Or can do % :

(' %s'*len(sys.argv)).lstrip()%sys.argv

[Sep 24, 2019] How to properly use the try catch in perl that error.pm provides?

Sep 24, 2019 | stackoverflow.com

Sinan Ünür ,Apr 28, 2012 at 18:07

Last I checked, Error was deprecated. But here's how you would do it without that module:
eval {
    die "Oops!";
    1;
} or do {
    my $e = $@;
    print("Something went wrong: $e\n");
};

Basically, use eval instead of try , die instead of throw , and look for the exception in $@ . The true value at the end of the eval block is part of an idiom to prevent $@ from unintentionally changing before it is used again in Perl versions older than 5.14, see P::C::P::ErrorHandling::RequireCheckingReturnValueOfEval for details. For example, this code suffers from this flaw.

# BAD, DO NOT USE WITH PERLS OLDER THAN 5.14
eval {
    die "Oops!";
};
if (my $e = $@) {
    print("Something went wrong: $e\n");
}
# BAD, DO NOT USE WITH PERLS OLDER THAN 5.14

But note that many Perl operations do not raise exceptions when they fail; they simply return an error code. This behavior can be altered via autodie for builtins and standard modules. If you're using autodie , then the standard way of doing try/catch is this (straight out of the autodie perldoc):

use feature qw(switch);

eval {
   use autodie;

   open(my $fh, '<', $some_file);

   my @records = <$fh>;

   # Do things with @records...

   close($fh);

};

given ($@) {
   when (undef)   { say "No error";                    }
   when ('open')  { say "Error from open";             }
   when (':io')   { say "Non-open, IO error.";         }
   when (':all')  { say "All other autodie errors."    }
   default        { say "Not an autodie error at all." }
}

For getting a stacktrace, look at Carp .
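
As a small illustration (combining Carp with the eval idiom above), Carp's confess dies with a full stack backtrace, so the caught error shows where the failure came from:

use strict;
use warnings;
use Carp qw(confess);

sub inner { confess "Oops!" }    # die with a message plus a stack backtrace
sub outer { inner() }

eval {
    outer();
    1;
} or do {
    my $e = $@;
    print "Caught:\n$e";         # the trace shows inner() was called from outer()
};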


If you want something a bit more powerful than Try::Tiny, you might want to try looking at the TryCatch module in CPAN.

[Sep 24, 2019] What is the best way to handle exceptions in Perl - Stack Overflow

Sep 24, 2019 | stackoverflow.com

Michael Carman ,Apr 8 at 9:52

The consensus of the Perl community seems to be that Try::Tiny is the preferred way of doing exception handling. The "lenient policy" you refer to is probably due to a combination of factors: Perl inherited much of its error-handling style from C, which has no native exceptions [1], and the built-in mechanism is nothing more than eval, die, and the global $@ variable rather than dedicated try/catch syntax.

That built-in mechanism means you'll see a lot of code like this:

eval { something() };
if ($@) {
    warn "Oh no! [$@]\n";
}

That's exception handling even though it doesn't use try/catch syntax. It's fragile, though, and will break in a number of subtle edge cases that most people don't think about.

Try::Tiny and the other exception handling modules on CPAN were written to make it easier to get right.

1. C does have setjmp() and longjmp() , which can be used for a very crude form of exception handling.
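
For comparison, here is a minimal sketch of the same try/catch pattern written with Try::Tiny (note the trailing semicolon, and that inside catch the error arrives in $_ rather than $@):

use strict;
use warnings;
use Try::Tiny;

try {
    die "Oops!";
} catch {
    warn "Something went wrong: $_";   # the error is passed in $_ (and in @_)
} finally {
    # runs whether or not an exception was thrown
};                                     # the trailing semicolon is required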


Never test $@ as is, because it is a global variable, so even the test itself can change it.

General eval-template:

my $result;

eval {
    $result= something();
    # ...
    1;  # ok
} or do {
    my $eval_error= $@ || "error";
    # ...
    die $eval_error;
};  # needs a semicolon

In practice that is the lightest way. It still leaves a tiny room for funny $@ behaviour, but nothing that really concerned me enough.

[Sep 24, 2019] How can I compile my Perl script so it can be executed on systems without perl installed - Stack Overflow

Sep 24, 2019 | stackoverflow.com


  1. Install PAR::Packer . Example for *nix:
    sudo cpan -i PAR::Packer
    

    For Strawberry Perl for Windows or for ActivePerl and MSVC installed:

    cpan -i PAR::Packer
    
  2. Pack it with pp . It will create an executable named "example" or "example.exe" on Windows.
    pp -o example example.pl
    

This would work only on the OS where it was built.

P.S. It is really hard to find a Unix clone without Perl. Did you mean Windows?

[Sep 21, 2019] How Did Perl Lose Ground to Bash?

Notable quotes:
"... It baffles me the most because the common objection to Perl is legibility. Even if you assume that the objection is made from ignorance - i.e. not even having looked at some Perl to gauge its legibility - the nonsense you see in a complex bash script is orders of magnitude worse! ..."
"... Maybe it's not reassuring to hear that, but I took an interest in Perl precisely because it's seen as an underdog and "dead" despite having experienced users and a lot of code, kind of like TCL, Prolog, or Ada. ..."
"... There's a long history of bad code written by mediocre developers who became the only one who could maintain the codebase until they no longer worked for the organization. The next poor sap to go in found a mess of a codebase and did their best to not break it further. After a few iterations, the whole thing is ready for /dev/null and Perl gets the blame. ..."
"... All in all, Perl is still my first go-to language, but there are definitely some things I wish it did better. ..."
"... The Perl leadership Osborned itself with Perl6. 20/20 hindsight says the new project should have been given a different name at conception, that way all the "watch this space -- under construction" signage wouldn't have steered people away from perfectly usable Perl5. Again, IMO. ..."
"... I don't observe the premise at all though. Is bash really gaining ground over anything recently? ..."
"... Python again is loved, because "taught by rote" idiots. Now you can give them pretty little packages. And it's no wonder they can do little better than be glorified system admins (which id rather have a real sys admin, since he's likely to understand Perl) ..."
"... Making a new language means lots of new training. Lots of profit in this. Nobody profits from writing new books on old languages. Lots of profit in general from supporting a new language. In the end, owning the language gets you profits. ..."
"... And I still don't get why tab for blocks python is even remotely more readable than Perl. ..."
"... If anything, JavaScript is pretty dang godly at what it does, I understand why that's popular. But I don't get python one bit, except to employ millions of entry level minions who can't think on their own. ..."
"... "Every teacher I know has students using it. We do it because it's an easy language, there's only one way to do it, and with whitespace as syntax it's easy to grade. We don't teach it because it is some powerful or exceptional language. " ..."
Sep 21, 2019 | www.reddit.com

How Did Perl Lose Ground to Bash?

Setting aside Perl vs. Python for the moment, how did Perl lose ground to Bash? It used to be that Bash scripts often got replaced by Perl scripts because Perl was more powerful. Even with very modern versions of Bash, Perl is much more powerful.

The Linux Standards Base (LSB) has helped ensure that certain tools are in predictable locations. Bash has gotten a bit more powerful since the release of 4.x, sure. Arrays, handicapped to 2-D arrays, have improved somewhat. There is a native regex engine in Bash 3.x, which I admit is a big deal. There is also support for hash maps.

This is all good stuff for Bash. But, none of this is sufficient to explain why Perl isn't the thing you learn after Bash, or, after Bash and Python; take your pick. Thoughts?


oldmanwillow21 9 points · 9 days ago

Because Perl has suffered immensely in the popularity arena and is now viewed as undesirable. It's not that Bash is seen as an adequate replacement for Perl, that's where Python has landed.

emilper 8 points · 8 days ago

How did Perl5 lose ground to anything else?

Thusly

- "thou must use Moose for everything" -> "Perl is too slow" -> rewrite in Python because the architect loves Python -> Python is even slower -> architect shunned by the team and everything new written in Go, nobody dares to complain about speed now because the budget people don't trust them -> Perl is slow

- "globals are bad, singletons are good" -> spaghetti -> Perl is unreadable

- "lets use every single item from the gang of four book" -> insanity -> Perl is bad

- "we must be more OOP" -> everything is a faux object with everything else as attributes -> maintenance team quits and they all take PHP jobs, at least the PHP people know their place in the order of things and do less hype-driven-development -> Perl is not OOP enough

- "CGI is bad" -> app needs 6.54GB of RAM for one worker -> customer refuses to pay for more RAM, fires the team, picks a PHP team to do the next version -> PHP team laughs all the way to the bank, chanting "CGI is king"

recrof 2 points · 8 days ago

"CGI is bad" is real. PSGI or FCGI is much faster for web services, and if there are memory leaks, it's always possible to debug & fix them.

Grinnz 6 points · 8 days ago

CGI is fine, when it's all you need. There are many different use cases out there. Just don't use CGI.pm .

emilper 2 points · 7 days ago

memory leaks

memory leaks ... do huge monoliths count as "memory leaks" ?

Altreus 7 points · 8 days ago

It baffles me the most because the common objection to Perl is legibility. Even if you assume that the objection is made from ignorance - i.e. not even having looked at some Perl to gauge its legibility - the nonsense you see in a complex bash script is orders of magnitude worse!

Not to mention its total lack of common language features like first-class data and... Like, a compiler...

I no longer write bash scripts because it takes about 5 lines to become unmaintainable.

crashorbit 5 points · 9 days ago

Every language that reaches functional equity with Perl is perceived as better than it. Mostly because hey, at least it's not Perl.

oldmanwillow21 15 points · 9 days ago · edited 9 days ago

Jumbled mess of thoughts surely to follow.

When I discuss projects with peers and mention that I chose to develop in Perl, the responses range from passive bemusement, to scorn, to ridicule. The assumption is usually that I'm using a dead language that's crippled in functionality and uses syntax that will surely make everyone's eyes bleed to read. This is the culture everywhere from the casual hackers to the C-suite.

I've proven at work that I can write nontrivial software using Perl. I'm still asked to use Python or Go (edit: or node, ugh) for any project that'll have contributors from other teams, or to containerize apps using Docker to remove the need for Perl knowledge for end-users (no CPAN, carton, etc.). But I'll take what I can get, and now the attitude has gone from "get with the times" or "that's cute", to "ok but I don't expect everyone else to know it".

Perl has got a lot to offer, and I vastly enjoy using it over other languages I work with. I know that all the impassioned figures in the Perl community love it just the same, but the community's got some major fragmentation going on. I understand that everyone's got ideas about the future of the language, but is this really the best time to pull the community apart? I feel like if everyone was able to let go of their ego and put their heads together to bring us to a point of stability, even a place where we're not laughed at for professing our support for the language, it would be a major step in the right direction. I think we're heading to the bottom fast, otherwise.

In that spirit of togetherness, I think the language, particularly the community, needs to be made more accessible to newcomers. Not accessible to one Perl offshoot, but accessible to Perl. It needs to be decided what Perl means in today's day and age. What can it do? Why would I want to use it over another shiny language? What are the definitive places I can go to learn more? Who else will be there? How do I contribute and grow as a Perl developer? There need to be people talking about Perl in places that aren't necessarily hubs for other Perl enthusiasts. It needs to be something business decision-makers can look at and feel confident in using.

I really hope something changes. I'd be pretty sad if I had to spend the rest of my career writing whatever the trendy language of the day is. These are just observations from someone that likes writing Perl and has been watching from the sidelines.

PhloxPaniculata 2 points · 7 days ago

Maybe it's not reassuring to hear that, but I took an interest in Perl precisely because it's seen as an underdog and "dead" despite having experienced users and a lot of code, kind of like TCL, Prolog, or Ada.

Being able to read Modern Perl for free also helped a lot. I'm still lacking experience in Perl and I've yet to write anything of importance in it because I don't see an area in which it's clearly better than anything else, either because of the language, a package, or a framework, and I don't do a lot of text-munging anymore (I'm also a fan of awk so for small tasks it has the priority).

codon011 1 point · 9 days ago

Don't call it Perl. Unfortunately. Also, IME multitasking in Perl5 (or the lack thereof, and/or severe issues with it) has been a detriment to its standing in a "multithread all the things" world.

crashorbit 4 points · 8 days ago

So often I see people drag themselves down that "thread my app" path. Eventually realize that they are implementing a whole multi-processing operating system inside their app rather than taking advantage of the perfectly good one they are running on.

There are several perfectly good ways to do concurrency, multitasking, async IO and so on in perl. Many work well in the single node case and in the multi-node case. Anyone who tells you that multitasking systems are easy because of some implementation language choice has not made it through the whole Dunning Kruger cycle yet.

codon011 2 points · 8 days ago

Multithreading is never easy. The processors will always manage to do things in a "wrong" order unless you are very careful with your gatekeeping. However, other languages/frameworks have paradigms that make it seem easier such that those race conditions show up much later in your product lifecycle.

codon011 3 points · 9 days ago

There's a long history of bad code written by mediocre developers who became the only one who could maintain the codebase until they no longer worked for the organization. The next poor sap to go in found a mess of a codebase and did their best to not break it further. After a few iterations, the whole thing is ready for /dev/null and Perl gets the blame.

Bash has limitations, but that (usually) means fewer ways to mess it up. There's less domain knowledge to learn, (afaik) no CPAN equivalent, and fewer issues with things like "I need to upgrade this but I can't because this other thing uses this older version which is incompatible with the newer version so now we have to maintain two versions of the library and/or interpreter."

All in all, Perl is still my first go-to language, but there are definitely some things I wish it did better.

crb3 3 points · 9 days ago · edited 9 days ago

*[e:] Consider, not just core here, but CPAN pull-in as well. I had one project clobbered on a smaller-memory machine when I tried to set up a pure-Perl scp transfer -- there wasn't room enough for the full file to transfer if it was larger than about 50k, what with all the CPAN. Shelling to commandline scp worked just fine.

beermad 2 points · 8 days ago

To be fair, wrapping a Perl script around something that's (if I read your comment right) just running SCP is adding a pointless extra layer of complexity anyway.

It's a matter of using the best tool for each particular job, not just sticking with one. My own ~/bin directory has a big mix of Perl and pure shell, depending on the complexity of the job to be done.

crb3 2 points · 8 days ago · edited 7 days ago

Agreed; I brought that example up to illustrate the bulk issue. In it, I was feeling my way, not sure how much finagling I might have to do for the task (backdoor-passing legitimate sparse but possibly quite bulky email from one server to another), which is why I initially went for the pure-Perl approach, so I'd have the mechanics exposed for any needed hackery. The experience taught me to get by more on shelling to precompiled tooling where appropriate... and a healthy respect for CPAN pull-in, [e:] the way that this module depends on that module so it gets pulled in along with its dependencies in turn, and the pileup grows in memory. There was a time or two here and there where I only needed a teeny bit of what a module does, so I went in and studied the code, then implemented it internally as a function without the object's generalities and bulk. The caution learned on ancient x86 boxes now seems appropriate on ARM boards like rPi; what goes around comes around.

minimim 1 point · 4 days ago

wouldn't have steered people away from perfectly usable Perl5

Perl5 development was completely stalled at the time. Perl6 brought not only new blood into it's own effort, it reinvigorated Perl5 in the process.

It's completely backwards to suggest Perl 5 was fine until perl6 came along. It was almost dormant and became a lively language after Perl 6 was announced.

perlancar 2 points · 8 days ago

I don't observe the premise at all though. Is bash really gaining ground over anything recently?

linearblade 3 points · 8 days ago

Perl is better than pretty much everything out there at what it does.

But keep in mind,

They say C sharp is loved by everyone, when in reality it's Microsoft pushing their narrative and the army of "learn by rote" engineers In developing countries

Python again is loved, because "taught by rote" idiots. Now you can give them pretty little packages. And it's no wonder they can do little better than be glorified system admins (which id rather have a real sys admin, since he's likely to understand Perl)

Making a new language means lots of new training. Lots of profit in this. Nobody profits from writing new books on old languages. Lots of profit in general from supporting a new language. In the end, owning the language gets you profits.

And I still don't get why tab for blocks python is even remotely more readable than Perl.

If anything, JavaScript is pretty dang godly at what it does, I understand why that's popular. But I don't get python one bit, except to employ millions of entry level minions who can't think on their own.

duo-rotae 6 points · 8 days ago

I know a comp sci professor. I asked why he thought Python was so popular.

"Every teacher I know has students using it. We do it because it's an easy language, there's only one way to do it, and with whitespace as syntax it's easy to grade. We don't teach it because it is some powerful or exceptional language. "

Then he said if he really needs to get something done, it's Perl or C.

linearblade 2 points · 8 days ago

Yep that's pretty much my opinion from using it.

techsnapp 1 point · 2 days ago

So is Perl harder than Python because of the lack of everyone else using it?

duo-rotae 1 point · 2 days ago

Perl has a steeper and longer learning curve than Python, and there is more than one way to do anything. And there are quite a few who continue coding with it.

[Sep 19, 2019] List::MoreUtils's minmax is more efficient when you need both the min and the max (because it does fewer comparisons).

Notable quotes:
"... List::MoreUtils's minmax is more efficient when you need both the min and the max (because it does fewer comparisons). ..."
Sep 19, 2019 | stackoverflow.com

List::Util's min and max are fine,

use List::Util qw( min max );
my $min = min @numbers;
my $max = max @numbers;

But List::MoreUtils's minmax is more efficient when you need both the min and the max (because it does fewer comparisons).

use List::MoreUtils qw( minmax );
my ($min, $max) = minmax @numbers;

List::Util is part of core, but List::MoreUtils isn't.

--ikegami
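
For intuition about the "fewer comparisons" point: a simultaneous min/max can process the elements in pairs, spending roughly three comparisons per pair instead of four. A minimal pure-Perl sketch of that idea (illustrative only -- it is not List::MoreUtils's actual code, and naive_minmax is just a name made up here):

use strict;
use warnings;

# Order each pair with one comparison, then compare the smaller element against
# the running min and the larger against the running max: about 3 comparisons
# per 2 elements instead of 4.
sub naive_minmax {
    my @n = @_;
    return unless @n;
    my ( $min, $max ) = ( $n[0], $n[0] );
    for ( my $i = 1; $i < @n; $i += 2 ) {
        my ( $lo, $hi ) = ( $n[$i], $i + 1 < @n ? $n[ $i + 1 ] : $n[$i] );
        ( $lo, $hi ) = ( $hi, $lo ) if $lo > $hi;
        $min = $lo if $lo < $min;
        $max = $hi if $hi > $max;
    }
    return ( $min, $max );
}

my ( $min, $max ) = naive_minmax( 3, 9, 1, 7, 5 );
print "$min $max\n";    # 1 9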

[Sep 17, 2019] How can a Perl regex re-use part of the previous match for the next match?

Sep 17, 2019 | stackoverflow.com



dlw ,Aug 16, 2009 at 3:52

I need some Perl regular expression help. The following snippet of code:
use strict; 
use warnings; 
my $str = "In this example, A plus B equals C, D plus E plus F equals G and H plus I plus J plus K equals L"; 
my $word = "plus"; 
my @results = ();
1 while $str =~ s/(.{2}\b$word\b.{2})/push(@results,"$1\n")/e;
print @results;

Produces the following output:

A plus B
D plus E
2 plus F
H plus I
4 plus J
5 plus K

What I want to see is this, where a character already matched can appear in a new match in a different context:

A plus B
D plus E
E plus F
H plus I
I plus J
J plus K

How do I change the regular expression to get this result? Thanks --- Dan

Michael Carman ,Aug 16, 2009 at 4:11

General advice: Don't use s/// when you want m// . Be specific in what you match.

The answer is pos :

#!/usr/bin/perl -l

use strict;
use warnings;

my $str = 'In this example, ' . 'A plus B equals C, ' .
          'D plus E plus F equals G ' .
          'and H plus I plus J plus K equals L';

my $word = "plus";

my @results;

while ( $str =~ /([A-Z] $word [A-Z])/g ) {
    push @results, $1;
    pos($str) -= 1;
}

print "'$_'" for @results;

Output:

C:\Temp> b
'A plus B'
'D plus E'
'E plus F'
'H plus I'
'I plus J'
'J plus K'

Michael Carman ,Aug 16, 2009 at 2:56

You can use a m//g instead of s/// and assign to the pos function to rewind the match location before the second term:
use strict;
use warnings;

my $str  = 'In this example, A plus B equals C, D plus E plus F equals G and H plus I plus J plus K equals L';
my $word = 'plus';
my @results;

while ($str =~ /(.{2}\b$word\b(.{2}))/g) {
    push @results, "$1\n";
    pos $str -= length $2;
}
print @results;

dlw ,Aug 18, 2009 at 13:00

Another option is to use a lookahead:
use strict; 
use warnings; 
my $str = "In this example, A plus B equals C, D plus E "
        . "plus F equals G and H plus I plus J plus K equals L"; 
my $word = "plus"; 
my $chars = 2;
my @results = ();

push @results, $1 
  while $str =~ /(?=((.{0,$chars}?\b$word\b).{0,$chars}))\2/g;

print "'$_'\n" for @results;

Within the lookahead, capturing group 1 matches the word along with a variable number of leading and trailing context characters, up to whatever maximum you've set. When the lookahead finishes, the backreference \2 matches "for real" whatever was captured by group 2, which is the same as group 1 except that it stops at the end of the word. That sets pos where you want it, without requiring you to calculate how many characters you actually matched after the word.

ysth ,Aug 16, 2009 at 9:01

Given the "Full Disclosure" comment (but assuming .{0,35} , not .{35} ), I'd do
use List::Util qw/max min/;
my $context = 35;
while ( $str =~ /\b$word\b/g ) {
    my $pre = substr( $str, max(0, $-[0] - $context), min( $-[0], $context ) );
    my $post = substr( $str, $+[0], $context );
    my $match = substr( $str, $-[0], $+[0] - $-[0] );
    $pre =~ s/.*\n//s;
    $post =~ s/\n.*//s;
    push @results, "$pre$match$post";
}
print for @results;

You'd skip the substitutions if you really meant (?s:.{0,35}) .

Greg Hewgill ,Aug 16, 2009 at 2:29

Here's one way to do it:
use strict; 
use warnings; 
my $str = "In this example, A plus B equals C, D plus E plus F equals G and H plus I plus J plus K equals L"; 
my $word = "plus"; 
my @results = ();
my $i = 0;
while (substr($str, $i) =~ /(.{2}\b$word\b.{2})/) {
    push @results, "$1\n";
    $i += $-[0] + 1;
}
print @results;

It's not terribly Perl-ish, but it works and it doesn't use too many obscure regular expression tricks. However, you might have to look up the function of the special variable @- in perlvar .

ghostdog74 ,Aug 16, 2009 at 3:44

You don't have to use a regex. Basically, just split up the string, use a loop to go over each item, check for "plus", then get the words before and after.
my $str = "In this example, A plus B equals C, D plus E plus F equals G and H plus I plus J plus K equals L"; 
@s = split /\s+/,$str;
for($i=0;$i<=scalar @s;$i++){
    if ( "$s[$i]"  eq "plus" ){
        print "$s[$i-1] plus $s[$i+1]\n";
    }
}

[Sep 16, 2019] How can I find the location of a regex match in Perl?

Notable quotes:
"... The built-in variables @- and @+ hold the start and end positions, respectively, of the last successful match. $-[0] and $+[0] correspond to entire pattern, while $-[N] and $+[N] correspond to the $N ( $1 , $2 , etc.) submatches. ..."
"... Edited to add: to quote from perlvar on $1..$9. "These variables are all read-only and dynamically scoped to the current BLOCK." In other words, if you want to use $1..$9, you cannot use a subroutine to do the matching. ..."
Sep 17, 2008 | stackoverflow.com

Michael Carman ,Sep 17, 2008 at 20:58

I need to write a function that receives a string and a regex. I need to check if there is a match and return the start and end location of a match. (The regex was already compiled by qr// .)

The function might also receive a "global" flag and then I need to return the (start,end) pairs of all the matches.

I cannot change the regex, not even add () around it as the user might use () and \1 . Maybe I can use (?:) .

Example: given "ababab" and the regex qr/ab/ , in the global case I need to get back 3 pairs of (start, end).

Nick T ,Sep 8, 2015 at 19:58

The built-in variables @- and @+ hold the start and end positions, respectively, of the last successful match. $-[0] and $+[0] correspond to entire pattern, while $-[N] and $+[N] correspond to the $N ( $1 , $2 , etc.) submatches.
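
A minimal sketch of that, using the "ababab" / qr/ab/ example from the question (variable names are just placeholders):

my $string = "ababab";
my $regex  = qr/ab/;

while ( $string =~ /$regex/g ) {
    # $-[0] is the start offset of the whole match, $+[0] is one past its end
    print "($-[0], $+[0])\n";
}
# Prints (0, 2), (2, 4) and (4, 6), one pair per line.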

szabgab ,Sep 17, 2008 at 23:51

Forget my previous post, I've got a better idea.
sub match_positions {
    my ($regex, $string) = @_;
    return if not $string =~ /$regex/;
    return ($-[0], $+[0]);
}
sub match_all_positions {
    my ($regex, $string) = @_;
    my @ret;
    while ($string =~ /$regex/g) {
        push @ret, [ $-[0], $+[0] ];
    }
    return @ret
}

This technique doesn't change the regex in any way.

Edited to add: to quote from perlvar on $1..$9. "These variables are all read-only and dynamically scoped to the current BLOCK." In other words, if you want to use $1..$9, you cannot use a subroutine to do the matching.

Aftershock ,Dec 23, 2012 at 12:13

With /g, the pos function gives you the position just past the end of the match. If you put your regex in parentheses you can get the length (and thus the start) using length $1 . Like this
sub match_positions {
    my ($regex, $string) = @_;
    return if not $string =~ /($regex)/g;
    return (pos($string) - length $1, pos($string));
}
sub all_match_positions {
    my ($regex, $string) = @_;
    my @ret;
    while ($string =~ /($regex)/g) {
        push @ret, [pos($string) - length $1, pos($string)];
    }
    return @ret
}

zigdon ,Sep 17, 2008 at 20:43

You can also use the deprecated $` variable, if you're willing to have all the REs in your program execute slower. From perlvar:
   $`      The string preceding whatever was matched by the last successful pattern match (not
           counting any matches hidden within a BLOCK or eval enclosed by the current BLOCK).
           (Mnemonic: "`" often precedes a quoted string.)  This variable is read-only.

           The use of this variable anywhere in a program imposes a considerable performance penalty
           on all regular expression matches.  See "BUGS".
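
Since Perl 5.10 the /p match flag exposes the same information through ${^PREMATCH} and ${^MATCH} without the program-wide penalty of $` . A small sketch (not from the original thread; the string and pattern are just placeholders):

my $string = "foo bar baz";
if ( $string =~ /bar/p ) {                # /p makes ${^PREMATCH} and ${^MATCH} available
    my $start = length ${^PREMATCH};      # same idea as length $`, but scoped to this match
    my $end   = $start + length ${^MATCH};
    print "$start $end\n";                # 4 7
}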

Shicheng Guo ,Jan 22, 2016 at 0:16

#!/usr/bin/perl

# search the positions of the CpGs in the human genome

sub match_positions {
    my ($regex, $string) = @_;
    return if not $string =~ /($regex)/g;
    return (pos($string) - length $1, pos($string));
}
sub all_match_positions {
    my ($regex, $string) = @_;
    my @ret;
    while ($string =~ /($regex)/g) {
        push @ret, [(pos($string)-length $1),pos($string)-1];
    }
    return @ret
}

my $regex='CG';
my $string="ACGACGCGCGCG";
my $cgap=3;    
my @pos=all_match_positions($regex,$string);

my @hgcg;

foreach my $pos(@pos){
    push @hgcg,@$pos[1];
}

foreach my $i(0..($#hgcg-$cgap+1)){
my $len=$hgcg[$i+$cgap-1]-$hgcg[$i]+2;
print "$len\n"; 
}

[Sep 16, 2019] How can I capture multiple matches from the same Perl regex - Stack Overflow

Sep 16, 2019 | stackoverflow.com



brian d foy ,May 22, 2010 at 15:42

I'm trying to parse a single string and get multiple chunks of data out from the same string with the same regex conditions. I'm parsing a single HTML doc that is static (For an undisclosed reason, I can't use an HTML parser to do the job.) I have an expression that looks like:
$string =~ /\<img\ssrc\="(.*)"/;

and I want to get the value of $1. However, in the one string, there are many img tags like this, so I need something like an array returned (@1?). Is this possible?

VolatileRig ,Jan 14, 2014 at 19:41

As in Jim's answer, use the /g modifier (in list context or in a loop).

But beware of greediness: you don't want the .* to match more than necessary (and don't escape < or = , they are not special).

while($string =~ /<img\s+src="(.*?)"/g ) {
  ...
}

Robert Wohlfarth ,May 21, 2010 at 18:44

@list = ($string =~ m/\<img\ssrc\="(.*)"/g);

The g modifier matches all occurrences in the string. List context returns all of the matches. See the m// operator in perlop .

dalton ,May 21, 2010 at 18:42

You just need the global modifier /g at the end of the match. Then loop through until there are no matches remaining
my @matches;
while ($string =~ /\<img\ssrc\="(.*)"/g) {
        push(@matches, $1);
}

VolatileRig ,May 24, 2010 at 16:37

Use the /g modifier and list context on the left, as in
@result = $string =~ /\<img\ssrc\="(.*)"/g;

[Sep 16, 2019] Pretty-print for shell script

Sep 16, 2019 | stackoverflow.com

Benoit ,Oct 21, 2010 at 13:19

I'm looking for something similar to indent but for (bash) scripts. Console only, no colorizing, etc.

Do you know of one ?

Jamie ,Sep 11, 2012 at 3:00

Vim can indent bash scripts. But not reformat them before indenting.
Backup your bash script, open it with vim, type gg=GZZ and indent will be corrected. (Note for the impatient: this overwrites the file, so be sure to do that backup!)

There are some bugs with << (expecting EOF as the first character on a line), though.

EDIT: ZZ not ZQ

Daniel Martí ,Apr 8, 2018 at 13:52

A bit late to the party, but it looks like shfmt could do the trick for you.

Brian Chrisman ,Sep 9 at 7:47

In bash I do this:
reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3 | sed -e "s/^\s\s\s\s//"
}

this eliminates comments and reindents the script "bash way".

If you have HEREDOCs in your script, they get ruined by the sed in the previous function.

So use:

reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3
}

But then all of your script will have four-space indentation.

Or you can do:

reindent () 
{ 
    rstr=$(mktemp -u "XXXXXXXXXX");
    source <(echo "Zibri () {";cat "$1"|sed -e "s/^\s\s\s\s/$rstr/"; echo "}");
    echo '#!/bin/bash';
    declare -f Zibri | head --lines=-1 | tail --lines=+3 | sed -e "s/^\s\s\s\s//;s/$rstr/    /"
}

which takes care also of heredocs.


Found this http://www.linux-kheops.com/doc/perl/perl-aubert/fmt.script .

Very nice, only one thing i took out is the [...]->test substitution.

[Sep 16, 2019] A command-line HTML pretty-printer Making messy HTML readable - Stack Overflow

Notable quotes:
"... Have a look at the HTML Tidy Project: http://www.html-tidy.org/ ..."
Sep 16, 2019 | stackoverflow.com

nisetama ,Aug 12 at 10:33

I'm looking for recommendations for HTML pretty printers which fulfill the following requirements:


Have a look at the HTML Tidy Project: http://www.html-tidy.org/

The granddaddy of HTML tools, with support for modern standards.

There used to be a fork called tidy-html5 which since became the official thing. Here is its GitHub repository .

Tidy is a console application for Mac OS X, Linux, Windows, UNIX, and more. It corrects and cleans up HTML and XML documents by fixing markup errors and upgrading legacy code to modern standards.

For your needs, here is the command line to call Tidy:

[Sep 11, 2019] string - Extract substring in Bash - Stack Overflow

Sep 11, 2019 | stackoverflow.com

Jeff ,May 8 at 18:30

Given a filename in the form someletters_12345_moreleters.ext , I want to extract the 5 digits and put them into a variable.

So to emphasize the point, I have a filename with x number of characters then a five digit sequence surrounded by a single underscore on either side then another set of x number of characters. I want to take the 5 digit number and put that into a variable.

I am very interested in the number of different ways that this can be accomplished.

Berek Bryan ,Jan 24, 2017 at 9:30

Use cut :
echo 'someletters_12345_moreleters.ext' | cut -d'_' -f 2

More generic:

INPUT='someletters_12345_moreleters.ext'
SUBSTRING=$(echo $INPUT| cut -d'_' -f 2)
echo $SUBSTRING

JB. ,Jan 6, 2015 at 10:13

If x is constant, the following parameter expansion performs substring extraction:
b=${a:12:5}

where 12 is the offset (zero-based) and 5 is the length

If the underscores around the digits are the only ones in the input, you can strip off the prefix and suffix (respectively) in two steps:

tmp=${a#*_}   # remove prefix ending in "_"
b=${tmp%_*}   # remove suffix starting with "_"

If there are other underscores, it's probably feasible anyway, albeit more tricky. If anyone knows how to perform both expansions in a single expression, I'd like to know too.

Both solutions presented are pure bash, with no process spawning involved, hence very fast.

A Sahra ,Mar 16, 2017 at 6:27

Generic solution where the number can be anywhere in the filename, using the first of such sequences:
number=$(echo $filename | egrep -o '[[:digit:]]{5}' | head -n1)

Another solution to extract exactly a part of a variable:

number=${filename:offset:length}

If your filename always has the format stuff_digits_... you can use awk:

number=$(echo $filename | awk -F _ '{ print $2 }')

Yet another solution: to remove everything except digits, use

number=$(echo $filename | tr -cd '[[:digit:]]')

sshow ,Jul 27, 2017 at 17:22

In case someone wants more rigorous information, you can also search it in man bash like this
$ man bash [press return key]
/substring  [press return key]
[press "n" key]
[press "n" key]
[press "n" key]
[press "n" key]

Result:

${parameter:offset}
       ${parameter:offset:length}
              Substring Expansion.  Expands to  up  to  length  characters  of
              parameter  starting  at  the  character specified by offset.  If
              length is omitted, expands to the substring of parameter  start‐
              ing at the character specified by offset.  length and offset are
              arithmetic expressions (see ARITHMETIC  EVALUATION  below).   If
              offset  evaluates  to a number less than zero, the value is used
              as an offset from the end of the value of parameter.  Arithmetic
              expressions  starting  with  a - must be separated by whitespace
              from the preceding : to be distinguished from  the  Use  Default
              Values  expansion.   If  length  evaluates to a number less than
              zero, and parameter is not @ and not an indexed  or  associative
              array,  it is interpreted as an offset from the end of the value
              of parameter rather than a number of characters, and the  expan‐
              sion is the characters between the two offsets.  If parameter is
              @, the result is length positional parameters beginning at  off‐
              set.   If parameter is an indexed array name subscripted by @ or
              *, the result is the length members of the array beginning  with
              ${parameter[offset]}.   A  negative  offset is taken relative to
              one greater than the maximum index of the specified array.  Sub‐
              string  expansion applied to an associative array produces unde‐
              fined results.  Note that a negative offset  must  be  separated
              from  the  colon  by  at least one space to avoid being confused
              with the :- expansion.  Substring indexing is zero-based  unless
              the  positional  parameters are used, in which case the indexing
              starts at 1 by default.  If offset  is  0,  and  the  positional
              parameters are used, $0 is prefixed to the list.

Aleksandr Levchuk ,Aug 29, 2011 at 5:51

Building on jor's answer (which doesn't work for me):
substring=$(expr "$filename" : '.*_\([^_]*\)_.*')

kayn ,Oct 5, 2015 at 8:48

I'm surprised this pure bash solution didn't come up:
a="someletters_12345_moreleters.ext"
IFS="_"
set $a
echo $2
# prints 12345

You probably want to reset IFS to what value it was before, or unset IFS afterwards!

zebediah49 ,Jun 4 at 17:31

Here's how I'd do it:
FN=someletters_12345_moreleters.ext
[[ ${FN} =~ _([[:digit:]]{5})_ ]] && NUM=${BASH_REMATCH[1]}

Note: the above is a regular expression and is restricted to your specific scenario of five digits surrounded by underscores. Change the regular expression if you need different matching.

TranslucentCloud ,Jun 16, 2014 at 13:27

Following the requirements

I have a filename with x number of characters then a five digit sequence surrounded by a single underscore on either side then another set of x number of characters. I want to take the 5 digit number and put that into a variable.

I found some grep ways that may be useful:

$ echo "someletters_12345_moreleters.ext" | grep -Eo "[[:digit:]]+" 
12345

or better

$ echo "someletters_12345_moreleters.ext" | grep -Eo "[[:digit:]]{5}" 
12345

And then with -Po syntax:

$ echo "someletters_12345_moreleters.ext" | grep -Po '(?<=_)\d+' 
12345

Or if you want to make it fit exactly 5 characters:

$ echo "someletters_12345_moreleters.ext" | grep -Po '(?<=_)\d{5}' 
12345

Finally, to store it in a variable you just need to use the var=$(command) syntax.

Darron ,Jan 9, 2009 at 16:13

Without any sub-processes you can:
shopt -s extglob
front=${input%%_+([a-zA-Z]).*}
digits=${front##+([a-zA-Z])_}

A very small variant of this will also work in ksh93.

user2350426 ,Aug 5, 2014 at 8:11
If we focus on the concept of:
"A run of (one or several) digits"

We could use several external tools to extract the numbers.
We could quite easily erase all other characters, either sed or tr:

name='someletters_12345_moreleters.ext'

echo $name | sed 's/[^0-9]*//g'    # 12345
echo $name | tr -c -d 0-9          # 12345

But if $name contains several runs of numbers, the above will fail:

If "name=someletters_12345_moreleters_323_end.ext", then:

echo $name | sed 's/[^0-9]*//g'    # 12345323
echo $name | tr -c -d 0-9          # 12345323

We need to use regular expressions (regex).
To select only the first run (12345 not 323) in sed and perl:

echo $name | sed 's/[^0-9]*\([0-9]\{1,\}\).*$/\1/'
perl -e 'my $name='$name';my ($num)=$name=~/(\d+)/;print "$num\n";'

But we could as well do it directly in bash (1) :

regex=[^0-9]*([0-9]{1,}).*$; \
[[ $name =~ $regex ]] && echo ${BASH_REMATCH[1]}

This allows us to extract the FIRST run of digits of any length
surrounded by any other text/characters.

Note : regex=[^0-9]*([0-9]{5,5}).*$; will match only exactly 5 digit runs. :-)

(1) : faster than calling an external tool for each short texts. Not faster than doing all processing inside sed or awk for large files.

codist ,May 6, 2011 at 12:50

Here's a prefix-suffix solution (similar to the solutions given by JB and Darron) that matches the first block of digits and does not depend on the surrounding underscores:
str='someletters_12345_morele34ters.ext'
s1="${str#"${str%%[[:digit:]]*}"}"   # strip off non-digit prefix from str
s2="${s1%%[^[:digit:]]*}"            # strip off non-digit suffix from s1
echo "$s2"                           # 12345

Campa ,Oct 21, 2016 at 8:12

I love sed 's capability to deal with regex groups:
> var="someletters_12345_moreletters.ext"
> digits=$( echo $var | sed "s/.*_\([0-9]\+\).*/\1/p" -n )
> echo $digits
12345

A slightly more general option would be not to assume that you have an underscore _ marking the start of your digits sequence, hence for instance stripping off all non-numbers you get before your sequence: s/[^0-9]\+\([0-9]\+\).*/\1/p .


> man sed | grep s/regexp/replacement -A 2
s/regexp/replacement/
    Attempt to match regexp against the pattern space.  If successful, replace that portion matched with replacement.  The replacement may contain the special  character  &  to
    refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.

More on this, in case you're not too confident with regexps:

All escapes \ are there to make sed 's regexp processing work.

Dan Dascalescu ,May 8 at 18:28

Given test.txt is a file containing "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
cut -b19-20 test.txt > test1.txt   # This will extract chars 19 & 20 ("ST")
while read -r; do
    x=$REPLY
done < test1.txt
echo $x
# ST

Alex Raj Kaliamoorthy ,Jul 29, 2016 at 7:41

My answer gives you more control over what you want out of your string. Here is the code showing how you can extract 12345 out of your string:
str="someletters_12345_moreleters.ext"
str=${str#*_}
str=${str%_more*}
echo $str

This will be more efficient if you want to extract something that has any chars like abc or any special characters like _ or - . For example: If your string is like this and you want everything that is after someletters_ and before _moreleters.ext :

str="someletters_123-45-24a&13b-1_moreleters.ext"

With my code you can specify exactly what you want. Explanation:

#*  removes the preceding string up to and including the matching key; here the key we mentioned is _
%   removes the following string starting at the matching key; here the key we mentioned is '_more*'

Do some experiments yourself and you would find this interesting.

Dan Dascalescu ,May 8 at 18:27

similar to substr('abcdefg', 2-1, 3) in php:
echo 'abcdefg'|tail -c +2|head -c 3

olibre ,Nov 25, 2015 at 14:50

Ok, here goes pure Parameter Substitution with an empty string. The caveat is that I have defined someletters and moreletters as letters only. If they are alphanumeric, this will not work as is.
filename=someletters_12345_moreletters.ext
substring=${filename//@(+([a-z])_|_+([a-z]).*)}
echo $substring
12345

gniourf_gniourf ,Jun 4 at 17:33

There's also the expr command (an external utility, not a bash builtin):
INPUT="someletters_12345_moreleters.ext"  
SUBSTRING=`expr match "$INPUT" '.*_\([[:digit:]]*\)_.*' `  
echo $SUBSTRING

russell ,Aug 1, 2013 at 8:12

A little late, but I just ran across this problem and found the following:
host:/tmp$ asd=someletters_12345_moreleters.ext 
host:/tmp$ echo `expr $asd : '.*_\(.*\)_'`
12345
host:/tmp$

I used it to get millisecond resolution on an embedded system that does not have %N for date:

set `grep "now at" /proc/timer_list`
nano=$3
fraction=`expr $nano : '.*\(...\)......'`
$debug nano is $nano, fraction is $fraction

> ,Aug 5, 2018 at 17:13

A bash solution:
IFS="_" read -r x digs x <<<'someletters_12345_moreleters.ext'

This will clobber a variable called x . The var x could be changed to the var _ .

input='someletters_12345_moreleters.ext'
IFS="_" read -r _ digs _ <<<"$input"

[Sep 10, 2019] How do I avoid an uninitialized value

Sep 10, 2019 | stackoverflow.com

marto ,Jul 15, 2011 at 16:52

I use this scrub function to clean up output from other functions.
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my %h = (
    a => 1,
    b => 1
    );

print scrub($h{c});

sub scrub {
    my $a = shift;

    return ($a eq '' or $a eq '~' or not defined $a) ? -1 : $a;
}

The problem occurs when I also would like to handle the case where the key in a hash doesn't exist, which is shown in the example with scrub($h{c}) .

What change should be made to scrub so it can handle this case?

Sandra Schlichting ,Jun 22, 2017 at 19:00

You're checking whether $a eq '' before checking whether it's defined, hence the warning "Use of uninitialized value in string eq". Simply change the order of things in the conditional:
return (!defined($a) or $a eq '' or $a eq '~') ? -1 : $a;

As soon as anything in the chain of 'or's matches, Perl will stop processing the conditional, thus avoiding the erroneous attempt to compare undef to a string.

Sandra Schlichting ,Jul 14, 2011 at 14:34

Inside scrub it is too late to check whether the hash has an entry for the key. scrub() only sees a scalar, which is undef if the hash key does not exist. But a hash could also have an entry whose value is undef , like this:
my %h = (
 a => 1,
 b => 1,
 c => undef
);

So I suggest to check for hash entries with the exists function.
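
A minimal sketch of that suggestion, checking exists at the call site (it reuses the corrected scrub from the answer above; the loop and the extra key 'd' are just for illustration):

#!/usr/bin/perl
use strict;
use warnings;

my %h = ( a => 1, b => 1, c => undef );

for my $key (qw(a b c d)) {
    if ( exists $h{$key} ) {
        print "$key => ", scrub( $h{$key} ), "\n";   # key present, value may still be undef
    }
    else {
        print "$key => (no such key)\n";             # key is missing entirely
    }
}

sub scrub {
    my $a = shift;
    return ( !defined($a) or $a eq '' or $a eq '~' ) ? -1 : $a;
}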

[Sep 10, 2019] How do I check if a Perl scalar variable has been initialized - Stack Overflow

Sep 10, 2019 | stackoverflow.com



brian d foy ,Sep 18, 2010 at 13:53

Is the following the best way to check if a scalar variable is initialized in Perl, using defined ?
my $var;

if (cond) {
    $var = "string1";
}

# Is this the correct way?
if (defined $var) {
    ...
}

mob ,Sep 25, 2010 at 21:35

Perl doesn't offer a way to check whether or not a variable has been initialized.

However, scalar variables that haven't been explicitly initialized with some value happen to have the value of undef by default. You are right about defined being the right way to check whether or not a variable has a value of undef .

There's several other ways tho. If you want to assign to the variable if it's undef , which your example code seems to indicate, you could, for example, use perl's defined-or operator:

$var //= 'a default value';

vol7ron ,Sep 17, 2010 at 23:17

It depends on what you're trying to do. The proper C way to do things is to initialize variables when they are declared; however, Perl is not C , so one of the following may be what you want:
  1)   $var = "foo" unless defined $var;      # set default after the fact
  2)   $var = defined $var? $var : {...};     # ternary operation
  3)   {...} if !(defined $var);              # another way to write 1)
  4)   $var = $var || "foo";                  # set to $var unless it's falsy, in which case set to 'foo'
  5)   $var ||= "foo";                        # retain value of $var unless it's falsy, in which case set to 'foo' (same as previous line)
  6)   $var = $var // "foo";                  # set to $var unless it's undefined, in which case set to 'foo'
  7)   $var //= "foo";                        # 5.10+ ; retain value of $var unless it's undefined, in which case set to 'foo' (same as previous line)


C way of doing things ( not recommended ):

# initialize the variable to a default value during declaration
#   then test against that value when you want to see if it's been changed
my $var = "foo";
{...}
if ($var eq "foo"){
   ... # do something
} else {
   ... # do something else
}

Another long-winded way of doing this is to create a class and a flag when the variable's been changed, which is unnecessary.

Axeman ,Sep 17, 2010 at 20:39

If you don't care whether or not it's empty, it is. Otherwise you can check
if ( length( $str || '' )) {}

swilliams ,Sep 17, 2010 at 20:53

It depends on what you plan on doing with the variable whether or not it is defined; as of Perl 5.10, you can do this (from perl5100delta ):

A new operator // (defined-or) has been implemented. The following expression:

 $a // $b

is merely equivalent to

defined $a ? $a : $b

and the statement

$c //= $d;

can now be used instead of

$c = $d unless defined $c;

rafl ,Jun 24, 2012 at 7:53

'defined' will return true if a variable has a real value.

As an aside, in a hash, this can be true:

if(exists $h{$e} && !defined $h{$e})

[Sep 10, 2019] logging - Perl - Output the log files - Stack Overflow

Aug 27, 2015 | stackoverflow.com



Arunesh Singh ,Aug 27, 2015 at 8:53

I have created a Perl script that telnets to multiple switches. I would like to check whether telnet works properly by telneting to each switch.

This is my code to telnet to the switches:

#!/usr/bin/perl
use warnings;
use Net::Telnet::Cisco;

open( OUTPUT, ">log.txt" );
open( SWITCHIP, "ip.txt" ) or die "couldn't open ip.txt";

my $count = 0;

while (<SWITCHIP>) {
    chomp($_);
    my $switch = $_;
    my $tl     = 0;
    my $t      = Net::Telnet::Cisco->new(
        Host => $switch,
        Prompt =>
            '/(?m:^(?:[\w.\/]+\:)?[\w.-]+\s?(?:\(config[^\)]*\))?\s?[\$#>]\s?(?:\(enable\))?\s*$)/',
        Timeout => 5,
        Errmode => 'return'
    ) or $tl = 1;

    my @output = ();
    if ( $tl != 1 ) {
        print "$switch Telnet success\n";
    }
    else {
        my $telnetstat = "Telnet Failed";
        print "$switch $telnetstat\n";
    }
    close(OUTPUT);
    $count++;
}

This is my output status after I was testing 7 switches:

10.xxx.3.17 Telnet success
10.xxx.10.12 Telnet success
10.xxx.136.10 Telnet success
10.xxx.136.12 Telnet success
10.xxx.188.188 Telnet Failed
10.xxx.136.13 Telnet success

I would like to write the telnet results to a log file.
How can I separate successful and failed telnet results using Perl?

Danny Luk ,Aug 28, 2015 at 8:40

Please Try the following
#!/usr/bin/perl
use warnings;
use Net::Telnet::Cisco;
################################### S
open( OUTPUTS, ">log_Success.txt" );
open( OUTPUTF, ">log_Fail.txt" );
################################### E
open( SWITCHIP, "ip.txt" ) or die "couldn't open ip.txt";

my $count = 0;

while (<SWITCHIP>) {
    chomp($_);
    my $switch = $_;
    my $tl     = 0;
    my $t      = Net::Telnet::Cisco->new(
        Host => $switch,
        Prompt =>
            '/(?m:^(?:[\w.\/]+\:)?[\w.-]+\s?(?:\(config[^\)]*\))?\s?[\$#>]\s?(?:\(enable\))?\s*$)/',
        Timeout => 5,
        Errmode => 'return'
    ) or $tl = 1;

    my @output = ();
################################### S
    if ( $tl != 1 ) {
        print "$switch Telnet success\n"; # for printing it in screen
        print OUTPUTS "$switch Telnet success\n"; # it will print it in the log_Success.txt
    }
    else {
        my $telnetstat = "Telnet Failed";
        print "$switch $telnetstat\n"; # for printing it in screen
        print OUTPUTF "$switch $telnetstat\n"; # it will print it in the log_Fail.txt
    }
################################### E
    $count++;
}
################################### S
close(SWITCHIP);
close(OUTPUTS);
close(OUTPUTF);
################################### E

Danny Luk ,Aug 28, 2015 at 8:39

In print statement after print just write the filehandle name which is OUTPUT in your code:
print OUTPUT "$switch Telnet success\n";

and

print OUTPUT "$switch $telnetstat\n";

A side note: always use a lexical filehandle and three arguments with error handling to open a file. This line open(OUTPUT, ">log.txt"); you can write like this:

open my $fhout, ">", "log.txt" or die $!;

Sobrique ,Aug 28, 2015 at 8:39

Use Sys::Syslog to write log messages.
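
A minimal Sys::Syslog sketch (openlog/syslog/closelog are the module's standard calls; the ident string, facility and IP addresses are just placeholders):

use strict;
use warnings;
use Sys::Syslog qw(:standard);    # exports openlog(), syslog(), closelog()

openlog( 'switch-telnet-check', 'ndelay,pid', 'local0' );
syslog( 'info',    '%s Telnet success', '10.0.0.1' );
syslog( 'warning', '%s Telnet failed',  '10.0.0.2' );
closelog();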

But since you're opening a log.txt file with the handle OUTPUT , just change your two print statements to have OUTPUT as the first argument and the string as the next (without a comma).

my $telnetstat;
if($tl != 1) {
  $telnetstat = "Telnet success";
} else {
  $telnetstat = "Telnet Failed";
}
print OUTPUT "$switch $telnetstat\n";

# Or the shorter ternary operator line for all the above:
print OUTPUT $switch . (!$tl ? " Telnet success\n" : " Telnet failed\n");

You might consider moving close to an END block:

END {
  close(OUTPUT);
}

Not only because it's in your while loop.

[Sep 08, 2019] How to replace spaces in file names using a bash script

Sep 08, 2019 | stackoverflow.com



Mark Byers ,Apr 25, 2010 at 19:20

Can anyone recommend a safe solution to recursively replace spaces with underscores in file and directory names starting from a given root directory? For example:
$ tree
.
|-- a dir
|   `-- file with spaces.txt
`-- b dir
    |-- another file with spaces.txt
    `-- yet another file with spaces.pdf

becomes:

$ tree
.
|-- a_dir
|   `-- file_with_spaces.txt
`-- b_dir
    |-- another_file_with_spaces.txt
    `-- yet_another_file_with_spaces.pdf

Jürgen Hötzel ,Nov 4, 2015 at 3:03

Use rename (aka prename ) which is a Perl script which may be on your system already. Do it in two steps:
find -name "* *" -type d | rename 's/ /_/g'    # do the directories first
find -name "* *" -type f | rename 's/ /_/g'

Based on Jürgen's answer and able to handle multiple layers of files and directories in a single bound using the "Revision 1.5 1998/12/18 16:16:31 rmb1" version of /usr/bin/rename (a Perl script):

find /tmp/ -depth -name "* *" -execdir rename 's/ /_/g' "{}" \;

oevna ,Jan 1, 2016 at 8:25

I use:
for f in *\ *; do mv "$f" "${f// /_}"; done

Though it's not recursive, it's quite fast and simple. I'm sure someone here could update it to be recursive.

The ${f// /_} part utilizes bash's parameter expansion mechanism to replace a pattern within a parameter with supplied string. The relevant syntax is ${parameter/pattern/string} . See: https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html or http://wiki.bash-hackers.org/syntax/pe .

armandino ,Dec 3, 2013 at 20:51

find . -depth -name '* *' \
| while IFS= read -r f ; do mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr ' ' _)" ; done

I failed to get it right at first, because I didn't think of directories.

Edmund Elmer ,Jul 3 at 7:12

you can use detox by Doug Harple
detox -r <folder>

Dennis Williamson ,Mar 22, 2012 at 20:33

A find/rename solution. rename is part of util-linux.

You need to descend depth first, because a whitespace filename can be part of a whitespace directory:

find /tmp/ -depth -name "* *" -execdir rename " " "_" "{}" ";"

armandino ,Apr 26, 2010 at 11:49

bash 4.0
#!/bin/bash
shopt -s globstar
for file in **/*\ *
do 
    mv "$file" "${file// /_}"       
done

Itamar ,Jan 31, 2013 at 21:27

you can use this:
find . -name '* *' | while read fname
do
    new_fname=`echo $fname | tr " " "_"`

    if [ -e $new_fname ]
    then
        echo "File $new_fname already exists. Not replacing $fname"
    else
        echo "Creating new file $new_fname to replace $fname"
        mv "$fname" $new_fname
    fi
done

yabt ,Apr 26, 2010 at 14:54

Here's a (quite verbose) find -exec solution which writes "file already exists" warnings to stderr:
function trspace() {
   declare dir name bname dname newname replace_char
   [ $# -lt 1 -o $# -gt 2 ] && { echo "usage: trspace dir char"; return 1; }
   dir="${1}"
   replace_char="${2:-_}"
   find "${dir}" -xdev -depth -name $'*[ \t\r\n\v\f]*' -exec bash -c '
      for ((i=1; i<=$#; i++)); do
         name="${@:i:1}"
         dname="${name%/*}"
         bname="${name##*/}"
         newname="${dname}/${bname//[[:space:]]/${0}}"
         if [[ -e "${newname}" ]]; then
            echo "Warning: file already exists: ${newname}" 1>&2
         else
            mv "${name}" "${newname}"
         fi
      done
  ' "${replace_char}" '{}' +
}

trspace rootdir _

degi ,Aug 8, 2011 at 9:10

This one does a little bit more. I use it to rename my downloaded torrents (no special characters (non-ASCII), spaces, multiple dots, etc.).
#!/usr/bin/perl

&rena(`find . -type d`);
&rena(`find . -type f`);

sub rena
{
    ($elems)=@_;
    @t=split /\n/,$elems;

    for $e (@t)
    {
    $_=$e;
    # remove ./ of find
    s/^\.\///;
    # non ascii transliterate
    tr [\200-\377][_];
    tr [\000-\40][_];
    # special characters we do not want in paths
    s/[ \-\,\;\?\+\'\"\!\[\]\(\)\@\#]/_/g;
    # multiple dots except for extension
    while (/\..*\./)
    {
        s/\./_/;
    }
    # only one _ consecutive
    s/_+/_/g;
    next if ($_ eq $e ) or ("./$_" eq $e);
    print "$e -> $_\n";
    rename ($e,$_);
    }
}

Junyeop Lee ,Apr 10, 2018 at 9:44

Recursive version of Naidim's Answers.
find . -name "* *" | awk '{ print length, $0 }' | sort -nr -s | cut -d" " -f2- | while read f; do base=$(basename "$f"); newbase="${base// /_}"; mv "$(dirname "$f")/$(basename "$f")" "$(dirname "$f")/$newbase"; done

ghoti ,Dec 5, 2016 at 21:16

I found around this script, it may be interesting :)
 IFS=$'\n';for f in `find .`; do file=$(echo $f | tr [:blank:] '_'); [ -e $f ] && [ ! -e $file ] && mv "$f" $file;done;unset IFS

ghoti ,Dec 5, 2016 at 21:17

Here's a reasonably sized bash script solution
#!/bin/bash
(
IFS=$'\n'
    for y in $(ls $1)
      do
         mv $1/`echo $y | sed 's/ /\\ /g'` $1/`echo "$y" | sed 's/ /_/g'`
      done
)

user1060059 ,Nov 22, 2011 at 15:15

This only finds files inside the current directory and renames them . I have this aliased.

find ./ -name "* *" -type f -d 1 | perl -ple '$file = $_; $file =~ s/\s+/_/g; rename($_, $file);

Hongtao ,Sep 26, 2014 at 19:30

I just made one for my own purposes. You can use it as a reference.
#!/bin/bash
cd /vzwhome/c0cheh1/dev_source/UB_14_8
for file in *
do
    echo $file
    cd "/vzwhome/c0cheh1/dev_source/UB_14_8/$file/Configuration/$file"
    echo "==> `pwd`"
    for subfile in *\ *; do [ -d "$subfile" ] && ( mv "$subfile" "$(echo $subfile | sed -e 's/ /_/g')" ); done
    ls
    cd /vzwhome/c0cheh1/dev_source/UB_14_8
done

Marcos Jean Sampaio ,Dec 5, 2016 at 20:56

For files in folder named /files
for i in `IFS="";find /files -name *\ *`
do
   echo $i
done > /tmp/list


while read line
do
   mv "$line" `echo $line | sed 's/ /_/g'`
done < /tmp/list

rm /tmp/list

Muhammad Annaqeeb ,Sep 4, 2017 at 11:03

For those struggling through this using macOS, first install all the tools:
 brew install tree findutils rename

Then, when you need to rename, alias GNU find (gfind) as find and run the code of @Michel Krelin:

alias find=gfind 
find . -depth -name '* *' \
| while IFS= read -r f ; do mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr ' ' _)" ; done

[Sep 07, 2019] As soon as you stop writing code on a regular basis you stop being a programmer. You lose your qualification very quickly. That's a typical tragedy of talented programmers who became mediocre managers or, worse, theoretical computer scientists

Programming skills are somewhat similar to the skills of people who play violin or piano. As soon as you stop playing, the skills start to evaporate: first slowly, then more quickly. In two years you will probably lose 80%.
Notable quotes:
"... I happened to look the other day. I wrote 35 programs in January, and 28 or 29 programs in February. These are small programs, but I have a compulsion. I love to write programs and put things into it. ..."
Sep 07, 2019 | archive.computerhistory.org

Dijkstra said he was proud to be a programmer. Unfortunately he changed his attitude completely, and I think he wrote his last computer program in the 1980s. At this conference I went to in 1967 about simulation language, Chris Strachey was going around asking everybody at the conference what was the last computer program you wrote. This was 1967. Some of the people said, "I've never written a computer program." Others would say, "Oh yeah, here's what I did last week." I asked Edsger this question when I visited him in Texas in the 90s and he said, "Don, I write programs now with pencil and paper, and I execute them in my head." He finds that a good enough discipline.

I think he was mistaken on that. He taught me a lot of things, but I really think that if he had continued... One of Dijkstra's greatest strengths was that he felt a strong sense of aesthetics, and he didn't want to compromise his notions of beauty. They were so intense that when he visited me in the 1960s, I had just come to Stanford. I remember the conversation we had. It was in the first apartment, our little rented house, before we had electricity in the house.

We were sitting there in the dark, and he was telling me how he had just learned about the specifications of the IBM System/360, and it made him so ill that his heart was actually starting to flutter.

He intensely disliked things that he didn't consider clean to work with. So I can see that he would have distaste for the languages that he had to work with on real computers. My reaction to that was to design my own language, and then make Pascal so that it would work well for me in those days. But his response was to do everything only intellectually.

So, programming.

I happened to look the other day. I wrote 35 programs in January, and 28 or 29 programs in February. These are small programs, but I have a compulsion. I love to write programs and put things into it. I think of a question that I want to answer, or I have part of my book where I want to present something. But I can't just present it by reading about it in a book. As I code it, it all becomes clear in my head. It's just the discipline. The fact that I have to translate my knowledge of this method into something that the machine is going to understand just forces me to make that crystal-clear in my head. Then I can explain it to somebody else infinitely better. The exposition is always better if I've implemented it, even though it's going to take me more time.

[Sep 07, 2019] Knuth about computer science and money: At that point I made the decision in my life that I wasn't going to optimize my income;

Sep 07, 2019 | archive.computerhistory.org

So I had a programming hat when I was outside of Cal Tech, and at Cal Tech I am a mathematician taking my grad studies. A startup company, called Green Tree Corporation because green is the color of money, came to me and said, "Don, name your price. Write compilers for us and we will take care of finding computers for you to debug them on, and assistance for you to do your work. Name your price." I said, "Oh, okay. $100,000.", assuming that this was In that era this was not quite at Bill Gate's level today, but it was sort of out there.

The guy didn't blink. He said, "Okay." I didn't really blink either. I said, "Well, I'm not going to do it. I just thought this was an impossible number."

At that point I made the decision in my life that I wasn't going to optimize my income; I was really going to do what I thought I could do for well, I don't know. If you ask me what makes me most happy, number one would be somebody saying "I learned something from you". Number two would be somebody saying "I used your software". But number infinity would be Well, no. Number infinity minus one would be "I bought your book". It's not as good as "I read your book", you know. Then there is "I bought your software"; that was not in my own personal value. So that decision came up. I kept up with the literature about compilers. The Communications of the ACM was where the action was. I also worked with people on trying to debug the ALGOL language, which had problems with it. I published a few papers, like "The Remaining Trouble Spots in ALGOL 60" was one of the papers that I worked on. I chaired a committee called "Smallgol" which was to find a subset of ALGOL that would work on small computers. I was active in programming languages.

[Sep 07, 2019] Knuth: maybe 1 in 50 people have the "computer scientist's" type of intellect

Sep 07, 2019 | conservancy.umn.edu

Frana: You have made the comment several times that maybe 1 in 50 people have the "computer scientist's mind."

Knuth: Yes.

Frana: I am wondering if a large number of those people are trained professional librarians? [laughter] There is some strangeness there. But can you pinpoint what it is about the mind of the computer scientist that is....

Knuth: That is different?

Frana: What are the characteristics?

Knuth: Two things: one is the ability to deal with non-uniform structure, where you have case one, case two, case three, case four. Or that you have a model of something where the first component is integer, the next component is a Boolean, and the next component is a real number, or something like that, you know, non-uniform structure. To deal fluently with those kinds of entities, which is not typical in other branches of mathematics, is critical. And the other characteristic ability is to shift levels quickly, from looking at something in the large to looking at something in the small, and many levels in between, jumping from one level of abstraction to another. You know that, when you are adding one to some number, that you are actually getting closer to some overarching goal. These skills, being able to deal with nonuniform objects and to see through things from the top level to the bottom level, these are very essential to computer programming, it seems to me. But maybe I am fooling myself because I am too close to it.

Frana: It is the hardest thing to really understand that which you are existing within.

Knuth: Yes.

[Sep 07, 2019] Knuth: I can be a writer, who tries to organize other people's ideas into some kind of a more coherent structure so that it is easier to put things together

Sep 07, 2019 | conservancy.umn.edu

Knuth: I can be a writer, who tries to organize other people's ideas into some kind of a more coherent structure so that it is easier to put things together. I can see that I could be viewed as a scholar that does his best to check out sources of material, so that people get credit where it is due. And to check facts over, not just to look at the abstract of something, but to see what the methods were that did it and to fill in holes if necessary. I look at my role as being able to understand the motivations and terminology of one group of specialists and boil it down to a certain extent so that people in other parts of the field can use it. I try to listen to the theoreticians and select what they have done that is important to the programmer on the street; to remove technical jargon when possible.

But I have never been good at any kind of a role that would be making policy, or advising people on strategies, or what to do. I have always been best at refining things that are there and bringing order out of chaos. I sometimes raise new ideas that might stimulate people, but not really in a way that would be in any way controlling the flow. The only time I have ever advocated something strongly was with literate programming; but I do this always with the caveat that it works for me, not knowing if it would work for anybody else.

When I work with a system that I have created myself, I can always change it if I don't like it. But everybody who works with my system has to work with what I give them. So I am not able to judge my own stuff impartially. So anyway, I have always felt bad about if anyone says, 'Don, please forecast the future,'...

[Sep 06, 2019] Knuth: Programming and architecture are interrelated and it is impossible to create good architecure wthout actually programming at least of a prototype

Notable quotes:
"... When you're writing a document for a human being to understand, the human being will look at it and nod his head and say, "Yeah, this makes sense." But then there's all kinds of ambiguities and vagueness that you don't realize until you try to put it into a computer. Then all of a sudden, almost every five minutes as you're writing the code, a question comes up that wasn't addressed in the specification. "What if this combination occurs?" ..."
"... When you're faced with implementation, a person who has been delegated this job of working from a design would have to say, "Well hmm, I don't know what the designer meant by this." ..."
Sep 06, 2019 | archive.computerhistory.org

...I showed the second version of this design to two of my graduate students, and I said, "Okay, implement this, please, this summer. That's your summer job." I thought I had specified a language. I had to go away. I spent several weeks in China during the summer of 1977, and I had various other obligations. I assumed that when I got back from my summer trips, I would be able to play around with TeX and refine it a little bit. To my amazement, the students, who were outstanding students, had not completed it. They had a system that was able to do about three lines of TeX. I thought, "My goodness, what's going on? I thought these were good students." Well afterwards I changed my attitude to saying, "Boy, they accomplished a miracle."

Because going from my specification, which I thought was complete, they really had an impossible task, and they had succeeded wonderfully with it. These students, by the way, [were] Michael Plass, who has gone on to be the brains behind almost all of Xerox's Docutech software and all kind of things that are inside of typesetting devices now, and Frank Liang, one of the key people for Microsoft Word.

He did important mathematical things as well as his hyphenation methods which are quite used in all languages now. These guys were actually doing great work, but I was amazed that they couldn't do what I thought was just sort of a routine task. Then I became a programmer in earnest, where I had to do it. The reason is when you're doing programming, you have to explain something to a computer, which is dumb.

When you're writing a document for a human being to understand, the human being will look at it and nod his head and say, "Yeah, this makes sense." But then there's all kinds of ambiguities and vagueness that you don't realize until you try to put it into a computer. Then all of a sudden, almost every five minutes as you're writing the code, a question comes up that wasn't addressed in the specification. "What if this combination occurs?"

It just didn't occur to the person writing the design specification. When you're faced with implementation, a person who has been delegated this job of working from a design would have to say, "Well hmm, I don't know what the designer meant by this."

If I hadn't been in China they would've scheduled an appointment with me and stopped their programming for a day. Then they would come in at the designated hour and we would talk. They would take 15 minutes to present to me what the problem was, and then I would think about it for a while, and then I'd say, "Oh yeah, do this. " Then they would go home and they would write code for another five minutes and they'd have to schedule another appointment.

I'm probably exaggerating, but this is why I think Bob Floyd's Chiron compiler never got going. Bob worked many years on a beautiful idea for a programming language, where he designed a language called Chiron, but he never touched the programming himself. I think this was actually the reason that he had trouble with that project, because it's so hard to do the design unless you're faced with the low-level aspects of it, explaining it to a machine instead of to another person.

Forsythe, I think it was, who said, "People have said traditionally that you don't understand something until you've taught it in a class. The truth is you don't really understand something until you've taught it to a computer, until you've been able to program it." At this level, programming was absolutely important

[Sep 06, 2019] Knuth: No, I stopped going to conferences. It was too discouraging. Computer programming keeps getting harder because more stuff is discovered

Sep 06, 2019 | conservancy.umn.edu

Knuth: No, I stopped going to conferences. It was too discouraging. Computer programming keeps getting harder because more stuff is discovered. I can cope with learning about one new technique per day, but I can't take ten in a day all at once. So conferences are depressing; it means I have so much more work to do. If I hide myself from the truth I am much happier.

[Sep 06, 2019] How TAOCP was hatched

Notable quotes:
"... Also, Addison-Wesley was the people who were asking me to do this book; my favorite textbooks had been published by Addison Wesley. They had done the books that I loved the most as a student. For them to come to me and say, "Would you write a book for us?", and here I am just a secondyear gradate student -- this was a thrill. ..."
"... But in those days, The Art of Computer Programming was very important because I'm thinking of the aesthetical: the whole question of writing programs as something that has artistic aspects in all senses of the word. The one idea is "art" which means artificial, and the other "art" means fine art. All these are long stories, but I've got to cover it fairly quickly. ..."
Sep 06, 2019 | archive.computerhistory.org

Knuth: This is, of course, really the story of my life, because I hope to live long enough to finish it. But I may not, because it's turned out to be such a huge project. I got married in the summer of 1961, after my first year of graduate school. My wife finished college, and I could use the money I had made -- the $5000 on the compiler -- to finance a trip to Europe for our honeymoon.

We had four months of wedded bliss in Southern California, and then a man from Addison-Wesley came to visit me and said "Don, we would like you to write a book about how to write compilers."

The more I thought about it, I decided "Oh yes, I've got this book inside of me."

I sketched out that day -- I still have the sheet of tablet paper on which I wrote -- I sketched out 12 chapters that I thought ought to be in such a book. I told Jill, my wife, "I think I'm going to write a book."

As I say, we had four months of bliss, because the rest of our marriage has all been devoted to this book. Well, we still have had happiness. But really, I wake up every morning and I still haven't finished the book. So I try to -- I have to -- organize the rest of my life around this, as one main unifying theme. The book was supposed to be about how to write a compiler. They had heard about me from one of their editorial advisors, that I knew something about how to do this. The idea appealed to me for two main reasons. One is that I did enjoy writing. In high school I had been editor of the weekly paper. In college I was editor of the science magazine, and I worked on the campus paper as copy editor. And, as I told you, I wrote the manual for that compiler that we wrote. I enjoyed writing, number one.

Also, Addison-Wesley was the people who were asking me to do this book; my favorite textbooks had been published by Addison Wesley. They had done the books that I loved the most as a student. For them to come to me and say, "Would you write a book for us?", and here I am just a secondyear gradate student -- this was a thrill.

Another very important reason at the time was that I knew that there was a great need for a book about compilers, because there were a lot of people who even in 1962 -- this was January of 1962 -- were starting to rediscover the wheel. The knowledge was out there, but it hadn't been explained. The people who had discovered it, though, were scattered all over the world and they didn't know of each other's work either, very much. I had been following it. Everybody I could think of who could write a book about compilers, as far as I could see, they would only give a piece of the fabric. They would slant it to their own view of it. There might be four people who could write about it, but they would write four different books. I could present all four of their viewpoints in what I would think was a balanced way, without any axe to grind, without slanting it towards something that I thought would be misleading to the compiler writer for the future. I considered myself as a journalist, essentially. I could be the expositor, the tech writer, that could do the job that was needed in order to take the work of these brilliant people and make it accessible to the world. That was my motivation. Now, I didn't have much time to spend on it then, I just had this page of paper with 12 chapter headings on it. That's all I could do while I'm a consultant at Burroughs and doing my graduate work. I signed a contract, but they said "We know it'll take you a while." I didn't really begin to have much time to work on it until 1963, my third year of graduate school, as I'm already finishing up on my thesis. In the summer of '62, I guess I should mention, I wrote another compiler. This was for Univac; it was a FORTRAN compiler. I spent the summer, I sold my soul to the devil, I guess you say, for three months in the summer of 1962 to write a FORTRAN compiler. I believe that the salary for that was $15,000, which was much more than an assistant professor. I think assistant professors were getting eight or nine thousand in those days.

Feigenbaum: Well, when I started in 1960 at [University of California] Berkeley, I was getting $7,600 for the nine-month year.

Knuth: Yeah, so you see it. I got $15,000 for a summer job in 1962 writing a FORTRAN compiler. One day during that summer I was writing the part of the compiler that looks up identifiers in a hash table. The method that we used is called linear probing. Basically you take the variable name that you want to look up, you scramble it, like you square it or something like this, and that gives you a number between one and, well in those days it would have been between 1 and 1000, and then you look there. If you find it, good; if you don't find it, go to the next place and keep on going until you either get to an empty place, or you find the number you're looking for. It's called linear probing.

There was a rumor that one of Professor Feller's students at Princeton had tried to figure out how fast linear probing works and was unable to succeed. This was a new thing for me. It was a case where I was doing programming, but I also had a mathematical problem that would go into my other [job]. My winter job was being a math student, my summer job was writing compilers. There was no mix. These worlds did not intersect at all in my life at that point. So I spent one day during the summer while writing the compiler looking at the mathematics of how fast does linear probing work. I got lucky, and I solved the problem. I figured out some math, and I kept two or three sheets of paper with me and I typed it up. ["Notes on 'Open' Addressing", 7/22/63] I guess that's on the internet now, because this became really the genesis of my main research work, which developed not to be working on compilers, but to be working on what they call analysis of algorithms, which is, have a computer method and find out how good is it quantitatively. I can say, if I got so many things to look up in the table, how long is linear probing going to take. It dawned on me that this was just one of many algorithms that would be important, and each one would lead to a fascinating mathematical problem. This was easily a good lifetime source of rich problems to work on. Here I am then, in the middle of 1962, writing this FORTRAN compiler, and I had one day to do the research and mathematics that changed my life for my future research trends. But now I've gotten off the topic of what your original question was.
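[A minimal sketch of the linear probing idea Knuth describes, written here in bash purely for illustration; the table size, the crude "scramble" function and all the names are my own assumptions, not Knuth's code.]

#!/usr/bin/env bash
# Hash a name into a fixed-size table; on a collision, step to the next
# slot until we find the name or an empty slot (linear probing).

TABLE_SIZE=1000
declare -a table                       # empty slots are simply unset

hash_name() {                          # crude "scramble": sum of character codes mod table size
    local s=$1 sum=0 i code
    for ((i = 0; i < ${#s}; i++)); do
        printf -v code '%d' "'${s:i:1}"
        (( sum += code ))
    done
    echo $(( sum % TABLE_SIZE ))
}

insert() {                             # probe until we hit the name or an empty slot, then store it
    local name=$1 idx
    idx=$(hash_name "$name")
    while [[ -n ${table[idx]} && ${table[idx]} != "$name" ]]; do
        (( idx = (idx + 1) % TABLE_SIZE ))
    done
    table[idx]=$name
}

lookup() {                             # report the slot if the name is present
    local name=$1 idx
    idx=$(hash_name "$name")
    while [[ -n ${table[idx]} ]]; do
        [[ ${table[idx]} == "$name" ]] && { echo "$name found in slot $idx"; return 0; }
        (( idx = (idx + 1) % TABLE_SIZE ))
    done
    echo "$name not found"
    return 1
}

insert foo; insert bar
lookup foo
lookup baz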

Feigenbaum: We were talking about sort of the.. You talked about the embryo of The Art of Computing. The compiler book morphed into The Art of Computer Programming, which became a seven-volume plan.

Knuth: Exactly. Anyway, I'm working on a compiler and I'm thinking about this. But now I'm starting, after I finish this summer job, then I began to do things that were going to be relating to the book. One of the things I knew I had to have in the book was an artificial machine, because I'm writing a compiler book but machines are changing faster than I can write books. I have to have a machine that I'm totally in control of. I invented this machine called MIX, which was typical of the computers of 1962.

In 1963 I wrote a simulator for MIX so that I could write sample programs for it, and I taught a class at Caltech on how to write programs in assembly language for this hypothetical computer. Then I started writing the parts that dealt with sorting problems and searching problems, like the linear probing idea. I began to write those parts, which are part of a compiler, of the book. I had several hundred pages of notes gathering for those chapters for The Art of Computer Programming. Before I graduated, I've already done quite a bit of writing on The Art of Computer Programming.

I met George Forsythe about this time. George was the man who inspired both of us [Knuth and Feigenbaum] to come to Stanford during the '60s. George came down to Southern California for a talk, and he said, "Come up to Stanford. How about joining our faculty?" I said "Oh no, I can't do that. I just got married, and I've got to finish this book first." I said, "I think I'll finish the book next year, and then I can come up [and] start thinking about the rest of my life, but I want to get my book done before my son is born." Well, John is now 40-some years old and I'm not done with the book. Part of my lack of expertise is any good estimation procedure as to how long projects are going to take. I way underestimated how much needed to be written about in this book. Anyway, I started writing the manuscript, and I went merrily along writing pages of things that I thought really needed to be said. Of course, it didn't take long before I had started to discover a few things of my own that weren't in any of the existing literature. I did have an axe to grind. The message that I was presenting was in fact not going to be unbiased at all. It was going to be based on my own particular slant on stuff, and that original reason for why I should write the book became impossible to sustain. But the fact that I had worked on linear probing and solved the problem gave me a new unifying theme for the book. I was going to base it around this idea of analyzing algorithms, and have some quantitative ideas about how good methods were. Not just that they worked, but that they worked well: this method worked 3 times better than this method, or 3.1 times better than this method. Also, at this time I was learning mathematical techniques that I had never been taught in school. I found they were out there, but they just hadn't been emphasized openly, about how to solve problems of this kind.

So my book would also present a different kind of mathematics than was common in the curriculum at the time, that was very relevant to analysis of algorithm. I went to the publishers, I went to Addison Wesley, and said "How about changing the title of the book from 'The Art of Computer Programming' to 'The Analysis of Algorithms'." They said that will never sell; their focus group couldn't buy that one. I'm glad they stuck to the original title, although I'm also glad to see that several books have now come out called "The Analysis of Algorithms", 20 years down the line.

But in those days, The Art of Computer Programming was very important because I'm thinking of the aesthetical: the whole question of writing programs as something that has artistic aspects in all senses of the word. The one idea is "art" which means artificial, and the other "art" means fine art. All these are long stories, but I've got to cover it fairly quickly.

I've got The Art of Computer Programming started out, and I'm working on my 12 chapters. I finish a rough draft of all 12 chapters by, I think it was like 1965. I've got 3,000 pages of notes, including a very good example of what you mentioned about seeing holes in the fabric. One of the most important chapters in the book is parsing: going from somebody's algebraic formula and figuring out the structure of the formula. Just the way I had done in seventh grade finding the structure of English sentences, I had to do this with mathematical sentences.

Chapter ten is all about parsing of context-free language, [which] is what we called it at the time. I covered what people had published about context-free languages and parsing. I got to the end of the chapter and I said, well, you can combine these ideas and these ideas, and all of a sudden you get a unifying thing which goes all the way to the limit. These other ideas had sort of gone partway there. They would say "Oh, if a grammar satisfies this condition, I can do it efficiently." "If a grammar satisfies this condition, I can do it efficiently." But now, all of a sudden, I saw there was a way to say I can find the most general condition that can be done efficiently without looking ahead to the end of the sentence. That you could make a decision on the fly, reading from left to right, about the structure of the thing. That was just a natural outgrowth of seeing the different pieces of the fabric that other people had put together, and writing it into a chapter for the first time. But I felt that this general concept, well, I didn't feel that I had surrounded the concept. I knew that I had it, and I could prove it, and I could check it, but I couldn't really intuit it all in my head. I knew it was right, but it was too hard for me, really, to explain it well.

So I didn't put it in The Art of Computer Programming. I thought it was beyond the scope of my book. Textbooks don't have to cover everything when you get to the harder things; then you have to go to the literature. My idea at that time [is] I'm writing this book and I'm thinking it's going to be published very soon, so any little things I discover and put in the book I didn't bother to write a paper and publish in the journal because I figure it'll be in my book pretty soon anyway. Computer science is changing so fast, my book is bound to be obsolete.

It takes a year for it to go through editing, and people drawing the illustrations, and then they have to print it and bind it and so on. I have to be a little bit ahead of the state-of-the-art if my book isn't going to be obsolete when it comes out. So I kept most of the stuff to myself that I had, these little ideas I had been coming up with. But when I got to this idea of left-to-right parsing, I said "Well here's something I don't really understand very well. I'll publish this, let other people figure out what it is, and then they can tell me what I should have said." I published that paper I believe in 1965, at the end of finishing my draft of the chapter, which didn't get as far as that story, LR(k). Well now, textbooks of computer science start with LR(k) and take off from there. But I want to give you an idea of

[Sep 06, 2019] Python vs. Ruby: Which is best for web development? - Opensource.com

Sep 06, 2019 | opensource.com

Python was developed organically in the scientific space as a prototyping language that easily could be translated into C++ if a prototype worked. This happened long before it was first used for web development. Ruby, on the other hand, became a major player specifically because of web development; the Rails framework extended Ruby's popularity with people developing complex websites.

Which programming language best suits your needs? Here is a quick overview of each language to help you choose:

Approach: one best way vs. human-language

Python

Python takes a direct approach to programming. Its main goal is to make everything obvious to the programmer. In Python, there is only one "best" way to do something. This philosophy has led to a language strict in layout.

Python's core philosophy consists of three key hierarchical principles:

  1. Explicit is better than implicit
  2. Simple is better than complex
  3. Complex is better than complicated

This regimented philosophy results in Python being eminently readable and easy to learn -- and it is why Python is great for beginning coders. Python has a big foothold in introductory programming courses. Its syntax is very simple, with little to remember. Because its code structure is explicit, the developer can easily tell where everything comes from, making it relatively easy to debug.

Python's hierarchy of principles is evident in many aspects of the language. Its use of whitespace to do flow control as a core part of the language syntax differs from most other languages, including Ruby. The way you indent code determines the meaning of its action. This use of whitespace is a prime example of Python's "explicit" philosophy: the shape a Python app takes spells out its logic and how the app will act.

Ruby

In contrast to Python, Ruby focuses on "human-language" programming, and its code reads like a verbal language rather than a machine-based one, which many programmers, both beginners and experts, like. Ruby follows the principle of "least astonishment," and offers myriad ways to do the same thing. These similar methods can have multiple names, which many developers find confusing and frustrating.

Unlike Python, Ruby makes use of "blocks," a first-class object that is treated as a unit within a program. In fact, Ruby takes the concept of OOP (Object-Oriented Programming) to its limit. Everything is an object -- even global variables are actually represented within the ObjectSpace object. Classes and modules are themselves objects, and functions and operators are methods of objects. This makes Ruby especially powerful, particularly when combined with its other primary strength: functional programming and the use of lambdas.

In addition to blocks and functional programming, Ruby provides programmers with many other features, including fragmentation, hashable and unhashable types, and mutable strings.

Ruby's fans find its elegance to be one of its top selling points. At the same time, Ruby's "magical" features and flexibility can make it very hard to track down bugs.

Communities: stability vs. innovation

Although features and coding philosophy are the primary drivers for choosing a given language, the strength of a developer community also plays an important role. Fortunately, both Python and Ruby boast strong communities.

Python

Python's community already includes a large Linux and academic community and therefore offers many academic use cases in both math and science. That support gives the community a stability and diversity that only grows as Python increasingly is used for web development.

Ruby

However, Ruby's community has focused primarily on web development from the get-go. It tends to innovate more quickly than the Python community, but this innovation also causes more things to break. In addition, while it has gotten more diverse, it has yet to reach the level of diversity that Python has.

Final thoughts

For web development, Ruby has Rails and Python has Django. Both are powerful frameworks, so when it comes to web development, you can't go wrong with either language. Your decision will ultimately come down to your level of experience and your philosophical preferences.

If you plan to focus on building web applications, Ruby is popular and flexible. There is a very strong community built upon it and they are always on the bleeding edge of development.

If you are interested in building web applications and would like to learn a language that's used more generally, try Python. You'll get a diverse community and lots of influence and support from the various industries in which it is used.

Tom Radcliffe - Tom Radcliffe has over 20 years of experience in software development and management in both academia and industry. He is a professional engineer (PEO and APEGBC) and holds a PhD in physics from Queen's University at Kingston. Tom brings a passion for quantitative, data-driven processes to ActiveState.

[Sep 03, 2019] bash - How to convert strings like 19-FEB-12 to epoch date in UNIX - Stack Overflow

Feb 11, 2013 | stackoverflow.com


hellish ,Feb 11, 2013 at 3:45

In UNIX how to convert to epoch milliseconds date strings like:
19-FEB-12
16-FEB-12
05-AUG-09

I need this to compare these dates with the current time on the server.


To convert a date to seconds since the epoch:
date --date="19-FEB-12" +%s

Current epoch:

date +%s

So, since your dates are in the past:

NOW=`date +%s`
THEN=`date --date="19-FEB-12" +%s`

let DIFF=$NOW-$THEN
echo "The difference is: $DIFF"

Using BSD's date command, you would need

$ date -j -f "%d-%B-%y" 19-FEB-12 +%s

Differences from GNU date:

  1. -j prevents date from trying to set the clock
  2. The input format must be explicitly set with -f
  3. The input date is a regular argument, not an option (viz. -d)
  4. When no time is specified with the date, use the current time instead of midnight.
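A small addition, not part of the quoted answers and assuming GNU date: the question asked for epoch milliseconds, while the commands above produce whole seconds. You can scale the result yourself, or ask GNU date for sub-second precision directly:

# seconds since the epoch, scaled to milliseconds
secs=$(date --date="19-FEB-12" +%s)
echo $(( secs * 1000 ))

# GNU date can also print milliseconds directly (%N = nanoseconds, %3N = its first 3 digits)
date +%s%3N        # current time in epoch milliseconds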

[Sep 03, 2019] command line - How do I convert an epoch timestamp to a human readable format on the cli - Unix Linux Stack Exchange

Sep 03, 2019 | unix.stackexchange.com

Gilles ,Oct 11, 2010 at 18:14

date -d @1190000000

Replace 1190000000 with your epoch.

Stefan Lasiewski ,Oct 11, 2010 at 18:04

$ echo 1190000000 | perl -pe 's/(\d+)/localtime($1)/e' 
Sun Sep 16 20:33:20 2007

This can come in handy for those applications which use epoch time in the logfiles:

$ tail -f /var/log/nagios/nagios.log | perl -pe 's/(\d+)/localtime($1)/e'
[Thu May 13 10:15:46 2010] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;HOSTA;check_raid;0;check_raid.pl: OK (Unit 0 on Controller 0 is OK)

Stéphane Chazelas ,Jul 31, 2015 at 20:24

With bash-4.2 or above:
printf '%(%F %T)T\n' 1234567890

(where %F %T is the strftime()-type format)

That syntax is inspired by ksh93.

In ksh93 however, the argument is taken as a date expression where various and hardly documented formats are supported.

For a Unix epoch time, the syntax in ksh93 is:

printf '%(%F %T)T\n' '#1234567890'

ksh93 however seems to use its own algorithm for the timezone and can get it wrong. For instance, in Britain, it was summer time all year in 1970, but:

$ TZ=Europe/London bash -c 'printf "%(%c)T\n" 0'
Thu 01 Jan 1970 01:00:00 BST
$ TZ=Europe/London ksh93 -c 'printf "%(%c)T\n" "#0"'
Thu Jan  1 00:00:00 1970
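Back in bash, a hedged addition not part of the quoted answer: the argument to %(...)T can also be one of two special values, -1 for the current time and -2 for the time the shell started, so no external date call is needed:

printf '%(%F %T)T\n' -1    # current time, formatted by the printf builtin itself
printf '%(%F %T)T\n' -2    # time the shell was invoked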

DarkHeart ,Jul 28, 2014 at 3:56

Custom format with GNU date :
date -d @1234567890 +'%Y-%m-%d %H:%M:%S'

Or with GNU awk :

awk 'BEGIN { print strftime("%Y-%m-%d %H:%M:%S", 1234567890); }'

Linked SO question: https://stackoverflow.com/questions/3249827/convert-from-unixtime-at-command-line


The two I frequently use are:
$ perl -leprint\ scalar\ localtime\ 1234567890
Sat Feb 14 00:31:30 2009

[Sep 02, 2019] bash - Pretty-print for shell script

Oct 21, 2010 | stackoverflow.com



Benoit ,Oct 21, 2010 at 13:19

I'm looking for something similar to indent but for (bash) scripts. Console only, no colorizing, etc.

Do you know of one?

Jamie ,Sep 11, 2012 at 3:00

Vim can indent bash scripts. But not reformat them before indenting.
Backup your bash script, open it with vim, type gg=GZZ and indent will be corrected. (Note for the impatient: this overwrites the file, so be sure to do that backup!)

There are some bugs with << heredocs, though (vim expects EOF as the first character on a line), e.g.

EDIT: ZZ not ZQ

Daniel Martí ,Apr 8, 2018 at 13:52

A bit late to the party, but it looks like shfmt could do the trick for you.
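For reference, a typical shfmt invocation looks like the sketch below (my own example; check shfmt's own help for the exact flags supported by your version):

shfmt -i 4 -d script.sh    # show a diff of what would change, using 4-space indentation
shfmt -i 4 -w script.sh    # rewrite the file in place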

Brian Chrisman ,Aug 11 at 4:08

In bash I do this:
reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3 | sed -e "s/^\s\s\s\s//"
}

this eliminates comments and reindents the script "bash way".

If you have HEREDOCS in your script, they got ruined by the sed in the previous function.

So use:

reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3
}

But all your script will have a 4 spaces indentation.

Or you can do:

reindent () 
{ 
    rstr=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 16 | head -n 1);
    source <(echo "Zibri () {";cat "$1"|sed -e "s/^\s\s\s\s/$rstr/"; echo "}");
    echo '#!/bin/bash';
    declare -f Zibri | head --lines=-1 | tail --lines=+3 | sed -e "s/^\s\s\s\s//;s/$rstr/    /"
}

which takes care also of heredocs.

Pius Raeder ,Jan 10, 2017 at 8:35

Found this http://www.linux-kheops.com/doc/perl/perl-aubert/fmt.script .

Very nice; the only thing I took out is the [...]->test substitution.

[Sep 02, 2019] Negative regex for Perl string pattern match

Sep 02, 2019 | stackoverflow.com

mirod ,Jun 15, 2011 at 17:21

I have this regex:
if($string =~ m/^(Clinton|[^Bush]|Reagan)/i)
  {print "$string\n"};

I want to match with Clinton and Reagan, but not Bush.

It's not working.

Calvin Taylor ,Jul 14, 2017 at 21:03

Sample text:

Clinton said
Bush used crayons
Reagan forgot

Just omitting a Bush match:

$ perl -ne 'print if /^(Clinton|Reagan)/' textfile
Clinton said
Reagan forgot

Or if you really want to specify:

$ perl -ne 'print if /^(?!Bush)(Clinton|Reagan)/' textfile
Clinton said
Reagan forgot

GuruM ,Oct 27, 2012 at 12:54

Your regex does not work because [] defines a character class, but what you want is a lookahead:
(?=) - Positive look ahead assertion foo(?=bar) matches foo when followed by bar
(?!) - Negative look ahead assertion foo(?!bar) matches foo when not followed by bar
(?<=) - Positive look behind assertion (?<=foo)bar matches bar when preceded by foo
(?<!) - Negative look behind assertion (?<!foo)bar matches bar when NOT preceded by foo
(?>) - Once-only subpatterns (?>\d+)bar Performance enhancing when bar not present
(?(x)) - Conditional subpatterns
(?(3)foo|fu)bar - Matches foo if 3rd subpattern has matched, fu if not
(?#) - Comment (?# Pattern does x y or z)

So try: (?!bush)

[Sep 02, 2019] How to get the current line number of a file open using Perl

Sep 02, 2019 | stackoverflow.com



tadmc ,May 8, 2011 at 17:08

open my $fp, '<', $file or die $!;

while (<$fp>) {
    my $line = $_;
    if ($line =~ /$regex/) {
        # How do I find out which line number this match happened at?
    }
}

close $fp;

tchrist ,Apr 22, 2015 at 21:16

Use $. (see perldoc perlvar).

tchrist ,May 7, 2011 at 16:48

You can also do it through OO interface:
use IO::Handle;
# later on ...
my $n = $fp->input_line_number();

This is in perldoc perlvar , too.


Don't use $. , nor $_ or any global variable. Use this instead:
while(my $line = <FILE>) {
  print $line unless ${\*FILE}->input_line_number == 1;
}

To avoid this and a lot of other Perl gotchas you can use packages like linter-perl on Atom or VSCode. Stop making Perl a write-only language!

[Aug 31, 2019] Complexity prevents programmers from ever learning the whole language; only a subset is learned and used

Aug 31, 2019 | ask.slashdot.org

Re:Neither! (Score 2, Interesting) by M. D. Nahas on Friday December 23, 2005 @06:08PM (#14329127), attached to: Learning Java or C# as a Next Language?

The cleanest languages I've used are C, Java, and OCaml. By "clean", I mean the language has a few concepts that can be completely memorized, which results in fewer "gotchas" and less manual reading. For these languages, you'll see small manuals (e.g., K&R's book for C) which cover the complete language, and then lots of pages devoted to the libraries that come with the language. I'd definitely recommend Java (or C, or OCaml) over C# for this reason.

C# seems to have combined every feature of C++, Java, and VBA into a single language. It is very complex and has a ton of concepts, so I could never memorize the whole language. I have a feeling that most programmers will use the subset of C# that is closest to the language they understand, whether it is C++, Java or VBA. You might as well learn Java's style of programming, and then, if needed, switch to C# using its Java-like features.

[Aug 29, 2019] How do I parse command line arguments in Bash - Stack Overflow

Jul 10, 2017 | stackoverflow.com

Livven, Jul 10, 2017 at 8:11

Update: It's been more than 5 years since I started this answer. Thank you for LOTS of great edits/comments/suggestions. In order to save maintenance time, I've modified the code block to be 100% copy-paste ready. Please do not post comments like "What if you changed X to Y ...". Instead, copy-paste the code block, see the output, make the change, rerun the script, and comment "I changed X to Y and ...". I don't have time to test your ideas and tell you if they work.
Method #1: Using bash without getopt[s]

Two common ways to pass key-value-pair arguments are:

Bash Space-Separated (e.g., --option argument ) (without getopt[s])

Usage demo-space-separated.sh -e conf -s /etc -l /usr/lib /etc/hosts

cat >/tmp/demo-space-separated.sh <<'EOF'
#!/bin/bash

POSITIONAL=()
while [[ $# -gt 0 ]]
do
key="$1"

case $key in
    -e|--extension)
    EXTENSION="$2"
    shift # past argument
    shift # past value
    ;;
    -s|--searchpath)
    SEARCHPATH="$2"
    shift # past argument
    shift # past value
    ;;
    -l|--lib)
    LIBPATH="$2"
    shift # past argument
    shift # past value
    ;;
    --default)
    DEFAULT=YES
    shift # past argument
    ;;
    *)    # unknown option
    POSITIONAL+=("$1") # save it in an array for later
    shift # past argument
    ;;
esac
done
set -- "${POSITIONAL[@]}" # restore positional parameters

echo "FILE EXTENSION  = ${EXTENSION}"
echo "SEARCH PATH     = ${SEARCHPATH}"
echo "LIBRARY PATH    = ${LIBPATH}"
echo "DEFAULT         = ${DEFAULT}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)
if [[ -n $1 ]]; then
    echo "Last line of file specified as non-opt/last argument:"
    tail -1 "$1"
fi
EOF

chmod +x /tmp/demo-space-separated.sh

/tmp/demo-space-separated.sh -e conf -s /etc -l /usr/lib /etc/hosts

output from copy-pasting the block above:

FILE EXTENSION  = conf
SEARCH PATH     = /etc
LIBRARY PATH    = /usr/lib
DEFAULT         =
Number files in SEARCH PATH with EXTENSION: 14
Last line of file specified as non-opt/last argument:
#93.184.216.34    example.com
Bash Equals-Separated (e.g., --option=argument ) (without getopt[s])

Usage demo-equals-separated.sh -e=conf -s=/etc -l=/usr/lib /etc/hosts

cat >/tmp/demo-equals-separated.sh <<'EOF'
#!/bin/bash

for i in "$@"
do
case $i in
    -e=*|--extension=*)
    EXTENSION="${i#*=}"
    shift # past argument=value
    ;;
    -s=*|--searchpath=*)
    SEARCHPATH="${i#*=}"
    shift # past argument=value
    ;;
    -l=*|--lib=*)
    LIBPATH="${i#*=}"
    shift # past argument=value
    ;;
    --default)
    DEFAULT=YES
    shift # past argument with no value
    ;;
    *)
          # unknown option
    ;;
esac
done
echo "FILE EXTENSION  = ${EXTENSION}"
echo "SEARCH PATH     = ${SEARCHPATH}"
echo "LIBRARY PATH    = ${LIBPATH}"
echo "DEFAULT         = ${DEFAULT}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)
if [[ -n $1 ]]; then
    echo "Last line of file specified as non-opt/last argument:"
    tail -1 $1
fi
EOF

chmod +x /tmp/demo-equals-separated.sh

/tmp/demo-equals-separated.sh -e=conf -s=/etc -l=/usr/lib /etc/hosts

output from copy-pasting the block above:

FILE EXTENSION  = conf
SEARCH PATH     = /etc
LIBRARY PATH    = /usr/lib
DEFAULT         =
Number files in SEARCH PATH with EXTENSION: 14
Last line of file specified as non-opt/last argument:
#93.184.216.34    example.com

To better understand ${i#*=} search for "Substring Removal" in this guide. It is functionally equivalent to `sed 's/[^=]*=//' <<< "$i"` which calls a needless subprocess or `echo "$i" | sed 's/[^=]*=//'` which calls two needless subprocesses.
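A quick illustration of that expansion (my own example, not from the answer): ${i#*=} strips the shortest leading part of $i that matches *= -- that is, everything up to and including the first equals sign.

i="--extension=conf"
echo "${i#*=}"     # conf          (shortest prefix matching '*=' removed)
echo "${i%%=*}"    # --extension   (the complementary expansion, shown for comparison)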

Method #2: Using bash with getopt[s]

from: http://mywiki.wooledge.org/BashFAQ/035#getopts

getopt(1) limitations (older, relatively-recent getopt versions):

  1. can't handle arguments that are empty strings
  2. can't handle arguments with embedded whitespace

More recent getopt versions don't have these limitations.

Additionally, the POSIX shell (and others) offer getopts which doesn't have these limitations. I've included a simplistic getopts example.

Usage demo-getopts.sh -vf /etc/hosts foo bar

cat >/tmp/demo-getopts.sh <<'EOF'
#!/bin/sh

# A POSIX variable
OPTIND=1         # Reset in case getopts has been used previously in the shell.

# Initialize our own variables:
output_file=""
verbose=0

while getopts "h?vf:" opt; do
    case "$opt" in
    h|\?)
        show_help
        exit 0
        ;;
    v)  verbose=1
        ;;
    f)  output_file=$OPTARG
        ;;
    esac
done

shift $((OPTIND-1))

[ "${1:-}" = "--" ] && shift

echo "verbose=$verbose, output_file='$output_file', Leftovers: $@"
EOF

chmod +x /tmp/demo-getopts.sh

/tmp/demo-getopts.sh -vf /etc/hosts foo bar

output from copy-pasting the block above:

verbose=1, output_file='/etc/hosts', Leftovers: foo bar

The advantages of getopts are:

  1. It's more portable, and will work in other shells like dash .
  2. It can handle multiple single options like -vf filename in the typical Unix way, automatically.

The disadvantage of getopts is that it can only handle short options ( -h , not --help ) without additional code.

There is a getopts tutorial which explains what all of the syntax and variables mean. In bash, there is also help getopts, which might be informative.
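As a sketch of the "additional code" mentioned above (my own example, not from the quoted answer): one common trick is to declare - as an option that takes an argument, so getopts hands you the long-option word in OPTARG and you unpack it yourself. Only the --opt and --opt=value forms are handled here.

#!/bin/bash
while getopts "hf:-:" opt; do
    case "$opt" in
        h) echo "usage: $0 [-h|--help] [-f file|--file=file]"; exit 0 ;;
        f) file=$OPTARG ;;
        -)  # long option: OPTARG holds everything after the leading "--"
            case "$OPTARG" in
                help)   echo "usage: $0 [-h|--help] [-f file|--file=file]"; exit 0 ;;
                file=*) file=${OPTARG#*=} ;;
                *)      echo "unknown option --$OPTARG" >&2; exit 1 ;;
            esac ;;
        *)  exit 1 ;;
    esac
done
shift $((OPTIND-1))
echo "file=$file, leftovers: $*"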

johncip ,Jul 23, 2018 at 15:15

No answer mentions enhanced getopt. And the top-voted answer is misleading: it either ignores -vfd style short options (requested by the OP) or options after positional arguments (also requested by the OP), and it ignores parsing errors. Instead:

The following calls

myscript -vfd ./foo/bar/someFile -o /fizz/someOtherFile
myscript -v -f -d -o/fizz/someOtherFile -- ./foo/bar/someFile
myscript --verbose --force --debug ./foo/bar/someFile -o/fizz/someOtherFile
myscript --output=/fizz/someOtherFile ./foo/bar/someFile -vfd
myscript ./foo/bar/someFile -df -v --output /fizz/someOtherFile

all return

verbose: y, force: y, debug: y, in: ./foo/bar/someFile, out: /fizz/someOtherFile

with the following myscript

#!/bin/bash
# saner programming env: these switches turn some bugs into errors
set -o errexit -o pipefail -o noclobber -o nounset

# -allow a command to fail with !'s side effect on errexit
# -use return value from ${PIPESTATUS[0]}, because ! hosed $?
! getopt --test > /dev/null 
if [[ ${PIPESTATUS[0]} -ne 4 ]]; then
    echo "I'm sorry, 'getopt --test' failed in this environment."
    exit 1
fi

OPTIONS=dfo:v
LONGOPTS=debug,force,output:,verbose

# -regarding ! and PIPESTATUS see above
# -temporarily store output to be able to check for errors
# -activate quoting/enhanced mode (e.g. by writing out "--options")
# -pass arguments only via   -- "$@"   to separate them correctly
! PARSED=$(getopt --options=$OPTIONS --longoptions=$LONGOPTS --name "$0" -- "$@")
if [[ ${PIPESTATUS[0]} -ne 0 ]]; then
    # e.g. return value is 1
    #  then getopt has complained about wrong arguments to stdout
    exit 2
fi
# read getopt's output this way to handle the quoting right:
eval set -- "$PARSED"

d=n f=n v=n outFile=-
# now enjoy the options in order and nicely split until we see --
while true; do
    case "$1" in
        -d|--debug)
            d=y
            shift
            ;;
        -f|--force)
            f=y
            shift
            ;;
        -v|--verbose)
            v=y
            shift
            ;;
        -o|--output)
            outFile="$2"
            shift 2
            ;;
        --)
            shift
            break
            ;;
        *)
            echo "Programming error"
            exit 3
            ;;
    esac
done

# handle non-option arguments
if [[ $# -ne 1 ]]; then
    echo "$0: A single input file is required."
    exit 4
fi

echo "verbose: $v, force: $f, debug: $d, in: $1, out: $outFile"

1 enhanced getopt is available on most "bash-systems", including Cygwin; on OS X try brew install gnu-getopt or sudo port install getopt
2 the POSIX exec() conventions have no reliable way to pass binary NULL in command line arguments; those bytes prematurely end the argument
3 first version released in 1997 or before (I only tracked it back to 1997)

Tobias Kienzler ,Mar 19, 2016 at 15:23

from : digitalpeer.com with minor modifications

Usage myscript.sh -p=my_prefix -s=dirname -l=libname

#!/bin/bash
for i in "$@"
do
case $i in
    -p=*|--prefix=*)
    PREFIX="${i#*=}"

    ;;
    -s=*|--searchpath=*)
    SEARCHPATH="${i#*=}"
    ;;
    -l=*|--lib=*)
    DIR="${i#*=}"
    ;;
    --default)
    DEFAULT=YES
    ;;
    *)
            # unknown option
    ;;
esac
done
echo PREFIX = ${PREFIX}
echo SEARCH PATH = ${SEARCHPATH}
echo DIRS = ${DIR}
echo DEFAULT = ${DEFAULT}

To better understand ${i#*=} search for "Substring Removal" in this guide. It is functionally equivalent to `sed 's/[^=]*=//' <<< "$i"` which calls a needless subprocess or `echo "$i" | sed 's/[^=]*=//'` which calls two needless subprocesses.

Robert Siemer ,Jun 1, 2018 at 1:57

getopt() / getopts() is a good option. Stolen from here :

The simple use of "getopt" is shown in this mini-script:

#!/bin/bash
echo "Before getopt"
for i
do
  echo $i
done
args=`getopt abc:d $*`
set -- $args
echo "After getopt"
for i
do
  echo "-->$i"
done

What we have said is that any of -a, -b, -c or -d will be allowed, but that -c is followed by an argument (the "c:" says that).

If we call this "g" and try it out:

bash-2.05a$ ./g -abc foo
Before getopt
-abc
foo
After getopt
-->-a
-->-b
-->-c
-->foo
-->--

We start with two arguments, and "getopt" breaks apart the options and puts each in its own argument. It also added "--".

hfossli ,Jan 31 at 20:05

More succinct way

script.sh

#!/bin/bash

while [[ "$#" -gt 0 ]]; do case $1 in
  -d|--deploy) deploy="$2"; shift;;
  -u|--uglify) uglify=1;;
  *) echo "Unknown parameter passed: $1"; exit 1;;
esac; shift; done

echo "Should deploy? $deploy"
echo "Should uglify? $uglify"

Usage:

./script.sh -d dev -u

# OR:

./script.sh --deploy dev --uglify

bronson ,Apr 27 at 23:22

At the risk of adding another example to ignore, here's my scheme.

Hope it's useful to someone.

while [ "$#" -gt 0 ]; do
  case "$1" in
    -n) name="$2"; shift 2;;
    -p) pidfile="$2"; shift 2;;
    -l) logfile="$2"; shift 2;;

    --name=*) name="${1#*=}"; shift 1;;
    --pidfile=*) pidfile="${1#*=}"; shift 1;;
    --logfile=*) logfile="${1#*=}"; shift 1;;
    --name|--pidfile|--logfile) echo "$1 requires an argument" >&2; exit 1;;

    -*) echo "unknown option: $1" >&2; exit 1;;
    *) handle_argument "$1"; shift 1;;
  esac
done

Robert Siemer ,Jun 6, 2016 at 19:28

I'm about 4 years late to this question, but want to give back. I used the earlier answers as a starting point to tidy up my old adhoc param parsing. I then refactored out the following template code. It handles both long and short params, using = or space separated arguments, as well as multiple short params grouped together. Finally it re-inserts any non-param arguments back into the $1,$2.. variables. I hope it's useful.
#!/usr/bin/env bash

# NOTICE: Uncomment if your script depends on bashisms.
#if [ -z "$BASH_VERSION" ]; then bash $0 $@ ; exit $? ; fi

echo "Before"
for i ; do echo - $i ; done


# Code template for parsing command line parameters using only portable shell
# code, while handling both long and short params, handling '-f file' and
# '-f=file' style param data and also capturing non-parameters to be inserted
# back into the shell positional parameters.

while [ -n "$1" ]; do
        # Copy so we can modify it (can't modify $1)
        OPT="$1"
        # Detect argument termination
        if [ x"$OPT" = x"--" ]; then
                shift
                for OPT ; do
                        REMAINS="$REMAINS \"$OPT\""
                done
                break
        fi
        # Parse current opt
        while [ x"$OPT" != x"-" ] ; do
                case "$OPT" in
                        # Handle --flag=value opts like this
                        -c=* | --config=* )
                                CONFIGFILE="${OPT#*=}"
                                shift
                                ;;
                        # and --flag value opts like this
                        -c* | --config )
                                CONFIGFILE="$2"
                                shift
                                ;;
                        -f* | --force )
                                FORCE=true
                                ;;
                        -r* | --retry )
                                RETRY=true
                                ;;
                        # Anything unknown is recorded for later
                        * )
                                REMAINS="$REMAINS \"$OPT\""
                                break
                                ;;
                esac
                # Check for multiple short options
                # NOTICE: be sure to update this pattern to match valid options
                NEXTOPT="${OPT#-[cfr]}" # try removing single short opt
                if [ x"$OPT" != x"$NEXTOPT" ] ; then
                        OPT="-$NEXTOPT"  # multiple short opts, keep going
                else
                        break  # long form, exit inner loop
                fi
        done
        # Done with that param. move to next
        shift
done
# Set the non-parameters back into the positional parameters ($1 $2 ..)
eval set -- $REMAINS


echo -e "After: \n configfile='$CONFIGFILE' \n force='$FORCE' \n retry='$RETRY' \n remains='$REMAINS'"
for i ; do echo - $i ; done


I have found writing portable argument parsing in scripts so frustrating that I wrote Argbash - a FOSS code generator that can generate the argument-parsing code for your script, and it has some nice features:

https://argbash.io

[Aug 29, 2019] shell - An example of how to use getopts in bash - Stack Overflow

The key thing to understand is that getopts just parses options. You still need to shift them off as a separate operation:
shift $((OPTIND-1))
May 10, 2013 | stackoverflow.com


chepner ,May 10, 2013 at 13:42

I want to call myscript file in this way:
$ ./myscript -s 45 -p any_string

or

$ ./myscript -h >>> should display help
$ ./myscript    >>> should display help

My requirements are:

I tried so far this code:

#!/bin/bash
while getopts "h:s:" arg; do
  case $arg in
    h)
      echo "usage" 
      ;;
    s)
      strength=$OPTARG
      echo $strength
      ;;
  esac
done

But with that code I get errors. How can I do it with Bash and getopt?


#!/bin/bash

usage() { echo "Usage: $0 [-s <45|90>] [-p <string>]" 1>&2; exit 1; }

while getopts ":s:p:" o; do
    case "${o}" in
        s)
            s=${OPTARG}
            ((s == 45 || s == 90)) || usage
            ;;
        p)
            p=${OPTARG}
            ;;
        *)
            usage
            ;;
    esac
done
shift $((OPTIND-1))

if [ -z "${s}" ] || [ -z "${p}" ]; then
    usage
fi

echo "s = ${s}"
echo "p = ${p}"

Example runs:

$ ./myscript.sh
Usage: ./myscript.sh [-s <45|90>] [-p <string>]

$ ./myscript.sh -h
Usage: ./myscript.sh [-s <45|90>] [-p <string>]

$ ./myscript.sh -s "" -p ""
Usage: ./myscript.sh [-s <45|90>] [-p <string>]

$ ./myscript.sh -s 10 -p foo
Usage: ./myscript.sh [-s <45|90>] [-p <string>]

$ ./myscript.sh -s 45 -p foo
s = 45
p = foo

$ ./myscript.sh -s 90 -p bar
s = 90
p = bar

[Aug 28, 2019] How do I import a perl module outside of @INC that does not end in .pm - Stack Overflow

Aug 22, 2019 | stackoverflow.com


mob ,Aug 22 at 19:47

Background

I am attempting to import a perl module that does not end in .pm with a method similar to this answer :

use lib "/hard/coded/directory";
use scripting;

However, when I attempt to import a module in this way, I get the following error when running perl -c :

Can't locate scripting.pm in @INC (@INC contains: ... ... ... /hard/coded/directory) at name of script line 47.

BEGIN failed--compilation aborted at name of script line 47.

Question

How do I import a perl module outside of @INC that does not have .pm at the end of the file?

ikegami ,Aug 22 at 20:04

If the file has a package directive, the file name and the package directive need to match, so simply fix the file name.

If the file doesn't have a package directive, you don't have a module , and you shouldn't use use or require . This can cause problems.

What you have is sometimes called a library, and you should use do .

do('/hard/coded/directory/scripting')
   or die $@ || $!;

(For proper error checking, the file needs to result in a true value.)

That said, you are probably trying to do something really awful. I'm guessing you either have a configuration file written in Perl or a poorly written module. Perl is not a suitable choice of language for a configuration file, and avoiding namespaces is just bad programming with no benefit.

ikegami ,Aug 22 at 20:10

If the source file does not define new namespaces or classes and you just want to read the function definitions or data from a file, Perl provides the do and require functions.
do "scripting";
require "scripting";

The difference between them is that require will look for the file to evaluate to a true value (it expects the last statement in the file to resolve to a non-zero, non-empty value), and will emit a fatal error if this does not happen. (You will often see naked 1; statements at the end of modules to satisfy this requirement).

If scripting really contains class code and you do need all the functionality that the use function provides, remember that

use Foo::Bar qw(stuff);

is just syntactic sugar for

BEGIN {
    $file = <find Foo/Bar.pm on @INC>;
    require "$file";
    Foo::Bar->import( qw(stuff) )
}

and suggests how you can workaround your inability to use use :

BEGIN {
    require "scripting";
    scripting->import()
}

In theory, the file scripting might define some other package and begin with a line like package Something::Else; . Then you would load the package in this module with

BEGIN {
    require "scripting";
    Something::Else->import();
}

[Aug 27, 2019] perl defensive programming (die, assert, croak) - Stack Overflow

Aug 27, 2019 | stackoverflow.com



Zaid ,Feb 23, 2014 at 17:11

What is the best (or recommended) approach to do defensive programming in perl? For example if I have a sub which must be called with a (defined) SCALAR, an ARRAYREF and an optional HASHREF.

Three of the approaches I have seen:

sub test1 {
    die if !(@_ == 2 || @_ == 3);
    my ($scalar, $arrayref, $hashref) = @_;
    die if !defined($scalar) || ref($scalar);
    die if ref($arrayref) ne 'ARRAY';
    die if defined($hashref) && ref($hashref) ne 'HASH';
    #do s.th with scalar, arrayref and hashref
}

sub test2 {
    Carp::assert(@_ == 2 || @_ == 3) if DEBUG;
    my ($scalar, $arrayref, $hashref) = @_;
    if(DEBUG) {
        Carp::assert defined($scalar) && !ref($scalar);
        Carp::assert ref($arrayref) eq 'ARRAY';
        Carp::assert !defined($hashref) || ref($hashref) eq 'HASH';
    }
    #do s.th with scalar, arrayref and hashref
}

sub test3 {
    my ($scalar, $arrayref, $hashref) = @_;
    (@_ == 2 || @_ == 3 && defined($scalar) && !ref($scalar) && ref($arrayref) eq 'ARRAY' && (!defined($hashref) || ref($hashref) eq 'HASH'))
        or Carp::croak 'usage: test3(SCALAR, ARRAYREF, [HASHREF])';
    #do s.th with scalar, arrayref and hashref
}

tobyink ,Feb 23, 2014 at 21:44

use Params::Validate qw(:all);

sub Yada {
   my (...)=validate_pos(@_,{ type=>SCALAR },{ type=>ARRAYREF },{ type=>HASHREF,optional=>1 });
   ...
}

ikegami ,Feb 23, 2014 at 17:33

I wouldn't use any of them. Aside from not not accepting many array and hash references, the checks you used are almost always redundant.
>perl -we"use strict; sub { my ($x) = @_; my $y = $x->[0] }->( 'abc' )"
Can't use string ("abc") as an ARRAY ref while "strict refs" in use at -e line 1.

>perl -we"use strict; sub { my ($x) = @_; my $y = $x->[0] }->( {} )"
Not an ARRAY reference at -e line 1.

The only advantage to checking is that you can use croak to show the caller in the error message.


Proper way to check if you have an reference to an array:

defined($x) && eval { @$x; 1 }

Proper way to check if you have an reference to a hash:

defined($x) && eval { %$x; 1 }

Borodin ,Feb 23, 2014 at 17:23

None of the options you show display any message to give a reason for the failure, which I think is paramount.

It is also preferable to use croak instead of die from within library subroutines, so that the error is reported from the point of view of the caller.

I would replace all occurrences of if ! with unless . The former is a C programmer's habit.

I suggest something like this

sub test1 {
    croak "Incorrect number of parameters" unless @_ == 2 or @_ == 3;
    my ($scalar, $arrayref, $hashref) = @_;
    croak "Invalid first parameter" unless $scalar and not ref $scalar;
    croak "Invalid second parameter" unless ref $arrayref eq 'ARRAY';
    croak "Invalid third parameter" if defined $hashref and ref $hashref ne 'HASH';

    # do s.th with scalar, arrayref and hashref
}

[Aug 27, 2019] linux - How to show line number when executing bash script - Stack Overflow

Aug 27, 2019 | stackoverflow.com



dspjm ,Jul 23, 2013 at 7:31

I have a test script which has a lot of commands and generates a lot of output. I use set -x or set -v and set -e, so the script stops when an error occurs. However, it's still rather difficult for me to locate which line the execution stopped at in order to find the problem. Is there a method that can output the line number of the script before each line is executed? Or output the line number before the trace generated by set -x? Any method that can help locate the failing line would be a great help. Thanks.

Suvarna Pattayil ,Jul 28, 2017 at 17:25

You mention that you're already using -x. The variable PS4 holds the prompt printed before each command line is echoed when the -x option is set, and it defaults to : followed by a space.

You can change PS4 to emit the LINENO (The line number in the script or shell function currently executing).

For example, if your script reads:

$ cat script
foo=10
echo ${foo}
echo $((2 + 2))

Executing it thus would print line numbers:

$ PS4='Line ${LINENO}: ' bash -x script
Line 1: foo=10
Line 2: echo 10
10
Line 3: echo 4
4

http://wiki.bash-hackers.org/scripting/debuggingtips gives the ultimate PS4 that would output everything you will possibly need for tracing:

export PS4='+(${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]:+${FUNCNAME[0]}(): }'

Deqing ,Jul 23, 2013 at 8:16

In Bash, $LINENO contains the line number where the script currently executing.

If you need to know the line number where the function was called, try $BASH_LINENO . Note that this variable is an array.

For example:

#!/bin/bash       

function log() {
    echo "LINENO: ${LINENO}"
    echo "BASH_LINENO: ${BASH_LINENO[*]}"
}

function foo() {
    log "$@"
}

foo "$@"

See here for details of Bash variables.
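One more technique, not mentioned in these answers (a minimal sketch of my own): since the original problem was locating where set -e stopped the script, you can trap the ERR pseudo-signal and let bash report the failing line and command for you.

#!/bin/bash
set -e

# $BASH_COMMAND holds the command that was executing when the trap fired;
# in a simple top-level script $LINENO gives the line of that command.
trap 'echo "error on line $LINENO: $BASH_COMMAND" >&2' ERR

echo "before"
false               # set -e aborts here; the trap prints the line number and the command
echo "never reached"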

Eliran Malka ,Apr 25, 2017 at 10:14

Simple (but powerful) solution: place echo statements around the code you think is causing the problem and move them line by line until the messages no longer appear on screen, because the script has stopped due to an error before reaching them.

Even more powerful solution: install bashdb, the bash debugger, and debug the script line by line.

kklepper ,Apr 2, 2018 at 22:44

Workaround for shells without LINENO

In a fairly sophisticated script I wouldn't like to see all line numbers; rather I would like to be in control of the output.

Define a function

echo_line_no () {
    grep -n "$1" $0 |  sed "s/echo_line_no//" 
    # grep the line(s) containing input $1 with line numbers
    # replace the function name with nothing 
} # echo_line_no

Use it with quotes like

echo_line_no "this is a simple comment with a line number"

Output is

16   "this is a simple comment with a line number"

if the number of this line in the source file is 16.

This basically answers the question "How to show line number when executing bash script" for users of ash or other shells without LINENO.

Anything more to add?

Sure. Why do you need this? How do you work with this? What can you do with this? Is this simple approach really sufficient or useful? Why do you want to tinker with this at all?

Want to know more? Read reflections on debugging

[Aug 27, 2019] How do I get the filename and line number in Perl - Stack Overflow

Aug 27, 2019 | stackoverflow.com



Elijah ,Nov 1, 2010 at 17:35

I would like to get the current filename and line number within a Perl script. How do I do this?

For example, in a file call test.pl :

my $foo = 'bar';
print 'Hello World';
print functionForFilename() . ':' . functionForLineNo();

It would output:

Hello World
test.pl:3

tchrist ,Nov 2, 2010 at 19:13

These are available with the __LINE__ and __FILE__ tokens, as documented in perldoc perldata under "Special Literals":

The special literals __FILE__, __LINE__, and __PACKAGE__ represent the current filename, line number, and package name at that point in your program. They may be used only as separate tokens; they will not be interpolated into strings. If there is no current package (due to an empty package; directive), __PACKAGE__ is the undefined value.

Eric Strom ,Nov 1, 2010 at 17:41

The caller function will do what you are looking for:
sub print_info {
   my ($package, $filename, $line) = caller;
   ...
}

print_info(); # prints info about this line

This will get the information from where the sub is called, which is probably what you are looking for. The __FILE__ and __LINE__ directives only apply to where they are written, so you can not encapsulate their effect in a subroutine. (unless you wanted a sub that only prints info about where it is defined)


You can use:
print __FILE__. " " . __LINE__;

[Aug 26, 2019] bash - How to prevent rm from reporting that a file was not found

Aug 26, 2019 | stackoverflow.com



pizza ,Apr 20, 2012 at 21:29

I am using rm within a BASH script to delete many files. Sometimes the files are not present, so it reports many errors. I do not need this message. I have searched the man page for a command to make rm quiet, but the only option I found is -f , which from the description, "ignore nonexistent files, never prompt", seems to be the right choice, but the name does not seem to fit, so I am concerned it might have unintended consequences.

Keith Thompson ,Dec 19, 2018 at 13:05

The main use of -f is to force the removal of files that would not be removed using rm by itself (as a special case, it "removes" non-existent files, thus suppressing the error message).

You can also just redirect the error message using

$ rm file.txt 2> /dev/null

(or your operating system's equivalent). You can check the value of $? immediately after calling rm to see if a file was actually removed or not.
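A minimal illustration of that check (my own example): with the error message redirected away and without -f, the exit status still tells you whether the file was actually removed.

rm file.txt 2> /dev/null
if [ $? -eq 0 ]; then
    echo "file.txt removed"
else
    echo "file.txt was not removed (it probably did not exist)"
fi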

vimdude ,May 28, 2014 at 18:10

Yes, -f is the most suitable option for this.

tripleee ,Jan 11 at 4:50

-f is the correct flag, but for the test operator, not rm
[ -f "$THEFILE" ] && rm "$THEFILE"

this ensures that the file exists and is a regular file (not a directory, device node etc...)

mahemoff ,Jan 11 at 4:41

\rm -f file will never report not found.

Idelic ,Apr 20, 2012 at 16:51

As far as rm -f doing "anything else", it does force ( -f is shorthand for --force ) silent removal in situations where rm would otherwise ask you for confirmation. For example, when trying to remove a file not writable by you from a directory that is writable by you.

Keith Thompson ,May 28, 2014 at 18:09

I had the same issue with csh. The only solution I had was to create a dummy file that matched the pattern before running rm in my script.

[Aug 26, 2019] shell - rm -rf return codes

Aug 26, 2019 | superuser.com



SheetJS ,Aug 15, 2013 at 2:50

Can anyone tell me the possible return codes of rm -rf other than zero, i.e. the possible return codes for failure cases? I want to know the detailed reason for the failure, not just that the command failed (returned non-zero).

Adrian Frühwirth ,Aug 14, 2013 at 7:00

To see the return code, you can use echo $? in bash.

To see the actual meaning, some platforms (like Debian Linux) have the perror binary available, which can be used as follows:

$ rm -rf something/; perror $?
rm: cannot remove `something/': Permission denied
OS error code   1:  Operation not permitted

rm -rf automatically suppresses most errors. The most likely error you will see is 1 (Operation not permitted), which will happen if you don't have permissions to remove the file. -f intentionally suppresses most errors

Adrian Frühwirth ,Aug 14, 2013 at 7:21

grabbed coreutils from git....

looking at exit we see...

openfly@linux-host:~/coreutils/src $ cat rm.c | grep -i exit
  if (status != EXIT_SUCCESS)
  exit (status);
  /* Since this program exits immediately after calling 'rm', rm need not
  atexit (close_stdin);
          usage (EXIT_FAILURE);
        exit (EXIT_SUCCESS);
          usage (EXIT_FAILURE);
        error (EXIT_FAILURE, errno, _("failed to get attributes of %s"),
        exit (EXIT_SUCCESS);
  exit (status == RM_ERROR ? EXIT_FAILURE : EXIT_SUCCESS);

Now looking at the status variable....

openfly@linux-host:~/coreutils/src $ cat rm.c | grep -i status
usage (int status)
  if (status != EXIT_SUCCESS)
  exit (status);
  enum RM_status status = rm (file, &x);
  assert (VALID_STATUS (status));
  exit (status == RM_ERROR ? EXIT_FAILURE : EXIT_SUCCESS);

looks like there isn't much going on there with the exit status.

I see EXIT_FAILURE and EXIT_SUCCESS and not anything else.

so basically 0 and 1 / -1

To see specific exit() syscalls and how they occur in a process flow try this

openfly@linux-host:~/ $ strace rm -rf $whatever

fairly simple.

ref:

http://www.unix.com/man-page/Linux/EXIT_FAILURE/exit/

[Aug 26, 2019] debugging - How can I debug a Perl script - Stack Overflow

Jun 27, 2014 | stackoverflow.com

Matthew Lock ,Jun 27, 2014 at 1:01

To run your script under the perl debugger you should use the -d switch:
perl -d script.pl

But perl is flexible. It supplies some hooks, and you can make the debugger work the way you want.

So to use different debuggers you may do:

perl -d:DebugHooks::Terminal script.pl
# OR
perl -d:Trepan script.pl

Look at these modules here and here.

There are several most interesting perl modules that hook into perl debugger internals: Devel::NYTProf , Devel::Cover

And many others


If you want to do remote debugging (for CGI, or if you don't want to mix program output with the debugger command line), use this:

given test:

use v5.14;
say 1;
say 2;
say 3;

Start a listener on whatever host and port on terminal 1 (here localhost:12345):

$ nc -v -l localhost -p 12345

for readline support use rlwrap (you can use on perl -d too):

$ rlwrap nc -v -l localhost -p 12345

And start the test on another terminal (say terminal 2):

$ PERLDB_OPTS="RemotePort=localhost:12345" perl -d test

Input/Output on terminal 1:

Connection from 127.0.0.1:42994

Loading DB routines from perl5db.pl version 1.49
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(test:2): say 1;
  DB<1> n
main::(test:3): say 2;
  DB<1> select $DB::OUT

  DB<2> n
2
main::(test:4): say 3;
  DB<2> n
3
Debugged program terminated.  Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.  
  DB<2>

Output on terminal 2:

1

Note the following command if you want program output on the debug terminal:

select $DB::OUT

If you are vim user, install this plugin: dbg.vim which provides basic support for perl

[Aug 26, 2019] Debugging - How to use the Perl debugger

Aug 26, 2019 | stackoverflow.com
This is like "please can you give me an example how to drive a car" .

I have explained the basic commands that you will use most often. Beyond this you must read the debugger's inline help and reread the perldebug documentation

The debugger will do a lot more than this, but these are the basic commands that you need to know. You should experiment with them and look at the contents of the help text to get more proficient with the Perl debugger

[Aug 25, 2019] How to check if a variable is set in Bash?

Aug 25, 2019 | stackoverflow.com



Jens ,Jul 15, 2014 at 9:46

How do I know if a variable is set in Bash?

For example, how do I check if the user gave the first parameter to a function?

function a {
    # if $1 is set ?
}

Graeme ,Nov 25, 2016 at 5:07

(Usually) The right way
if [ -z ${var+x} ]; then echo "var is unset"; else echo "var is set to '$var'"; fi

where ${var+x} is a parameter expansion which evaluates to nothing if var is unset, and substitutes the string x otherwise.

Quotes Digression

Quotes can be omitted (so we can say ${var+x} instead of "${var+x}" ) because this syntax guarantees the expansion never needs quoting: it expands either to x (which contains no word breaks, so it needs no quotes) or to nothing, and an empty expansion yields [ -z ] , which conveniently evaluates to the same value (true) that [ -z "" ] does.

However, while the quotes can be safely omitted here, this was not immediately obvious to everyone (it wasn't even apparent to the first author of this explanation, who is also a major Bash coder). It would therefore sometimes be better to write the solution with quotes, as [ -z "${var+x}" ] , at the very small cost of an O(1) speed penalty. The first author also added this as a comment next to the code using this solution, giving the URL to this answer, which now also includes the explanation for why the quotes can be safely omitted.

(Often) The wrong way
if [ -z "$var" ]; then echo "var is blank"; else echo "var is set to '$var'"; fi

This is often wrong because it doesn't distinguish between a variable that is unset and a variable that is set to the empty string. That is to say, if var='' , then the above solution will output "var is blank".

The distinction between unset and "set to the empty string" is essential in situations where the user has to specify an extension, or additional list of properties, and that not specifying them defaults to a non-empty value, whereas specifying the empty string should make the script use an empty extension or list of additional properties.

The distinction may not be essential in every scenario though. In those cases [ -z "$var" ] will be just fine.

Flow ,Nov 26, 2014 at 13:49

To check for non-null/non-zero string variable, i.e. if set, use
if [ -n "$1" ]

It's the opposite of -z . I find myself using -n more than -z .

You would use it like:

if [ -n "$1" ]; then
  echo "You supplied the first parameter!"
else
  echo "First parameter not supplied."
fi

Jens ,Jan 19, 2016 at 23:30

Here's how to test whether a parameter is unset , or empty ("Null") or set with a value :
+--------------------+----------------------+-----------------+-----------------+
|                    |       parameter      |     parameter   |    parameter    |
|                    |   Set and Not Null   |   Set But Null  |      Unset      |
+--------------------+----------------------+-----------------+-----------------+
| ${parameter:-word} | substitute parameter | substitute word | substitute word |
| ${parameter-word}  | substitute parameter | substitute null | substitute word |
| ${parameter:=word} | substitute parameter | assign word     | assign word     |
| ${parameter=word}  | substitute parameter | substitute null | assign word     |
| ${parameter:?word} | substitute parameter | error, exit     | error, exit     |
| ${parameter?word}  | substitute parameter | substitute null | error, exit     |
| ${parameter:+word} | substitute word      | substitute null | substitute null |
| ${parameter+word}  | substitute word      | substitute word | substitute null |
+--------------------+----------------------+-----------------+-----------------+

Source: POSIX: Parameter Expansion :

In all cases shown with "substitute", the expression is replaced with the value shown. In all cases shown with "assign", parameter is assigned that value, which also replaces the expression.
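A small sketch exercising a few rows of that table (variable names are arbitrary):

unset u; e=""; v="value"
echo "${v:-word}"   # value  (set and not null: substitute parameter)
echo "${e:-word}"   # word   (set but null: substitute word)
echo "${u:-word}"   # word   (unset: substitute word)
echo "${e-word}"    # empty  (set but null: substitute null)
echo "${e+word}"    # word   (set but null: substitute word)
echo "${u+word}"    # empty  (unset: substitute null)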

Dan ,Jul 24, 2018 at 20:16

While most of the techniques stated here are correct, bash 4.2 supports an actual test for the presence of a variable ( man bash ), rather than testing the value of the variable.
[[ -v foo ]]; echo $?
# 1

foo=bar
[[ -v foo ]]; echo $?
# 0

foo=""
[[ -v foo ]]; echo $?
# 0

Notably, this approach will not cause an error when used to check for an unset variable in set -u / set -o nounset mode, unlike many other approaches, such as using [ -z "$var" ] .
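A short sketch of that nounset behaviour (Bash 4.2 or later assumed):

set -u
unset foo
[[ -v foo ]] && echo "foo is set" || echo "foo is unset"   # prints: foo is unset
# [ -z "$foo" ] at this point would abort the shell with "foo: unbound variable"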

chepner ,Sep 11, 2013 at 14:22

There are many ways to do this with the following being one of them:
if [ -z "$1" ]

This succeeds if $1 is null or unset

phkoester ,Feb 16, 2018 at 8:06

To see if a variable is nonempty, I use
if [[ $var ]]; then ...       # `$var' expands to a nonempty string

The opposite tests if a variable is either unset or empty:

if [[ ! $var ]]; then ...     # `$var' expands to the empty string (set or not)

To see if a variable is set (empty or nonempty), I use

if [[ ${var+x} ]]; then ...   # `var' exists (empty or nonempty)
if [[ ${1+x} ]]; then ...     # Parameter 1 exists (empty or nonempty)

The opposite tests if a variable is unset:

if [[ ! ${var+x} ]]; then ... # `var' is not set at all
if [[ ! ${1+x} ]]; then ...   # We were called with no arguments

Palec ,Jun 19, 2017 at 3:25

I always find the POSIX table in the other answer slow to grok, so here's my take on it:
   +----------------------+------------+-----------------------+-----------------------+
   |   if VARIABLE is:    |    set     |         empty         |        unset          |
   +----------------------+------------+-----------------------+-----------------------+
 - |  ${VARIABLE-default} | $VARIABLE  |          ""           |       "default"       |
 = |  ${VARIABLE=default} | $VARIABLE  |          ""           | $(VARIABLE="default") |
 ? |  ${VARIABLE?default} | $VARIABLE  |          ""           |       exit 127        |
 + |  ${VARIABLE+default} | "default"  |       "default"       |          ""           |
   +----------------------+------------+-----------------------+-----------------------+
:- | ${VARIABLE:-default} | $VARIABLE  |       "default"       |       "default"       |
:= | ${VARIABLE:=default} | $VARIABLE  | $(VARIABLE="default") | $(VARIABLE="default") |
:? | ${VARIABLE:?default} | $VARIABLE  |       exit 127        |       exit 127        |
:+ | ${VARIABLE:+default} | "default"  |          ""           |          ""           |
   +----------------------+------------+-----------------------+-----------------------+

Note that each group (with and without preceding colon) has the same set and unset cases, so the only thing that differs is how the empty cases are handled.

With the preceding colon, the empty and unset cases are identical, so I would use those where possible (i.e. use := , not just = , because the empty case is inconsistent).


chepner ,Mar 28, 2017 at 12:26

On a modern version of Bash (4.2 or later I think; I don't know for sure), I would try this:
if [ ! -v SOMEVARIABLE ] #note the lack of a $ sigil
then
    echo "Variable is unset"
elif [ -z "$SOMEVARIABLE" ]
then
    echo "Variable is set to an empty string"
else
    echo "Variable is set to some string"
fi

Gordon Davisson ,May 15, 2015 at 13:53

if [ "$1" != "" ]; then
  echo \$1 is set
else
  echo \$1 is not set
fi

Although for arguments it is normally best to test $#, which is the number of arguments, in my opinion.

if [ $# -gt 0 ]; then
  echo \$1 is set
else
  echo \$1 is not set
fi

Jarrod Chesney ,Dec 9, 2016 at 3:34

You want to exit if it's unset

This worked for me. I wanted my script to exit with an error message if a parameter wasn't set.

#!/usr/bin/env bash

set -o errexit

# Get the value and empty validation check all in one
VER="${1:?You must pass a version of the format 0.0.0 as the only argument}"

This returns with an error when it's run

peek@peek:~$ ./setver.sh
./setver.sh: line 13: 1: You must pass a version of the format 0.0.0 as the only argument
Check only, no exit - Empty and Unset are INVALID

Try this option if you want set (non-empty) to count as VALID and unset/empty to count as INVALID.

TSET="good val"
TEMPTY=""
unset TUNSET

if [ "${TSET:-}" ]; then echo "VALID"; else echo "INVALID";fi
# VALID
if [ "${TEMPTY:-}" ]; then echo "VALID"; else echo "INVALID";fi
# INVALID
if [ "${TUNSET:-}" ]; then echo "VALID"; else echo "INVALID";fi
# INVALID

Or, Even short tests ;-)

[ "${TSET:-}"   ] && echo "VALID" || echo "INVALID"
[ "${TEMPTY:-}" ] && echo "VALID" || echo "INVALID"
[ "${TUNSET:-}" ] && echo "VALID" || echo "INVALID"
Check only, no exit - Only empty is INVALID

And this is the answer to the question. Use this if you want set (even if empty) to count as VALID and only unset to count as INVALID.

NOTE, the "1" in "..-1}" is insignificant, it can be anything (like x)

TSET="good val"
TEMPTY=""
unset TUNSET

if [ "${TSET+1}" ]; then echo "VALID"; else echo "INVALID";fi
# VALID
if [ "${TEMPTY+1}" ]; then echo "VALID"; else echo "INVALID";fi
# VALID
if [ "${TUNSET+1}" ]; then echo "VALID"; else echo "INVALID";fi
# INVALID

Short tests

[ "${TSET+1}"   ] && echo "VALID" || echo "INVALID"
[ "${TEMPTY+1}" ] && echo "VALID" || echo "INVALID"
[ "${TUNSET+1}" ] && echo "VALID" || echo "INVALID"

I dedicate this answer to @mklement0 (comments) who challenged me to answer the question accurately.

Reference http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02

Gilles ,Aug 31, 2010 at 7:30

To check whether a variable is set with a non-empty value, use [ -n "$x" ] , as others have already indicated.

Most of the time, it's a good idea to treat a variable that has an empty value in the same way as a variable that is unset. But you can distinguish the two if you need to: [ -n "${x+set}" ] ( "${x+set}" expands to set if x is set and to the empty string if x is unset).
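A tiny illustration of the "${x+set}" idiom just described:

unset x
[ -n "${x+set}" ] && echo "x is set" || echo "x is unset"   # x is unset
x=""
[ -n "${x+set}" ] && echo "x is set" || echo "x is unset"   # x is set (to the empty string)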

To check whether a parameter has been passed, test $# , which is the number of parameters passed to the function (or to the script, when not in a function) (see Paul's answer ).

tripleee ,Sep 12, 2015 at 6:33

Read the "Parameter Expansion" section of the bash man page. Parameter expansion doesn't provide a general test for a variable being set, but there are several things you can do to a parameter if it isn't set.

For example:

function a {
    first_arg=${1-foo}
    # rest of the function
}

will set first_arg equal to $1 if it is assigned, otherwise it uses the value "foo". If a absolutely must take a single parameter, and no good default exists, you can exit with an error message when no parameter is given:

function a {
    : ${1?a must take a single argument}
    # rest of the function
}

(Note the use of : as a null command, which just expands the values of its arguments. We don't want to do anything with $1 in this example, just exit if it isn't set)

AlejandroVD ,Feb 8, 2016 at 13:31

In bash you can use -v inside the [[ ]] builtin:
#! /bin/bash -u

if [[ ! -v SOMEVAR ]]; then
    SOMEVAR='hello'
fi

echo $SOMEVAR

Palec ,Nov 16, 2016 at 15:01

For those that are looking to check for unset or empty when in a script with set -u :
if [ -z "${var-}" ]; then
   echo "Must provide var environment variable. Exiting...."
   exit 1
fi

The regular [ -z "$var" ] check will fail with var; unbound variable if set -u but [ -z "${var-}" ] expands to empty string if var is unset without failing.

user1387866 ,Jul 30 at 15:57

Note

I'm giving a heavily Bash-focused answer because of the bash tag.

Short answer

As long as you're only dealing with named variables in Bash, this function should always tell you if the variable has been set, even if it's an empty array.

is-variable-set() {
    declare -p $1 &>/dev/null
}
Why this works

In Bash (at least as far back as 3.0), if var is a declared/set variable, then declare -p var outputs a declare command that would set variable var to its current type and value, and returns status code 0 (success). If var is undeclared, then declare -p var outputs an error message to stderr and returns status code 1. Using &>/dev/null redirects both stdout and stderr output to /dev/null, never to be seen, without changing the status code. Thus the function only returns the status code.
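A quick sketch of that behaviour:

unset var
declare -p var &>/dev/null; echo $?   # 1: var is not declared
var=()
declare -p var &>/dev/null; echo $?   # 0: var is declared, even though the array is empty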

Why other methods (sometimes) fail in Bash
  • [ -n "$var" ] : This only checks if ${var[0]} is nonempty. (In Bash, $var is the same as ${var[0]} .)
  • [ -n "${var+x}" ] : This only checks if ${var[0]} is set.
  • [ "${#var[@]}" != 0 ] : This only checks if at least one index of $var is set.
When this method fails in Bash

This only works for named variables (including $_ ), not certain special variables ( $! , $@ , $# , $$ , $* , $? , $- , $0 , $1 , $2 , ..., and any I may have forgotten). Since none of these are arrays, the POSIX-style [ -n "${var+x}" ] works for all of these special variables. But beware of wrapping it in a function since many special variables change values/existence when functions are called.

Shell compatibility note

If your script has arrays and you're trying to make it compatible with as many shells as possible, then consider using typeset -p instead of declare -p . I've read that ksh only supports the former, but haven't been able to test this. I do know that Bash 3.0+ and Zsh 5.5.1 each support both typeset -p and declare -p , differing only in which one is an alternative for the other. But I haven't tested differences beyond those two keywords, and I haven't tested other shells.

If you need your script to be POSIX sh compatible, then you can't use arrays. Without arrays, [ -n "${var+x}" ] works.

Comparison code for different methods in Bash

This function unsets variable var , eval s the passed code, runs tests to determine if var is set by the eval d code, and finally shows the resulting status codes for the different tests.

I'm skipping test -v var , [ -v var ] , and [[ -v var ]] because they yield identical results to the POSIX standard [ -n "${var+x}" ] , while requiring Bash 4.2+. I'm also skipping typeset -p because it's the same as declare -p in the shells I've tested (Bash 3.0 thru 5.0, and Zsh 5.5.1).

is-var-set-after() {
    # Set var by passed expression.
    unset var
    eval "$1"

    # Run the tests, in increasing order of accuracy.
    [ -n "$var" ] # (index 0 of) var is nonempty
    nonempty=$?
    [ -n "${var+x}" ] # (index 0 of) var is set, maybe empty
    plus=$?
    [ "${#var[@]}" != 0 ] # var has at least one index set, maybe empty
    count=$?
    declare -p var &>/dev/null # var has been declared (any type)
    declared=$?

    # Show test results.
    printf '%30s: %2s %2s %2s %2s\n' "$1" $nonempty $plus $count $declared
}
Test case code

Note that test results may be unexpected due to Bash treating non-numeric array indices as "0" if the variable hasn't been declared as an associative array. Also, associative arrays are only valid in Bash 4.0+.

# Header.
printf '%30s: %2s %2s %2s %2s\n' "test" '-n' '+x' '#@' '-p'
# First 5 tests: Equivalent to setting 'var=foo' because index 0 of an
# indexed array is also the nonindexed value, and non-numerical
# indices in an array not declared as associative are the same as
# index 0.
is-var-set-after "var=foo"                        #  0  0  0  0
is-var-set-after "var=(foo)"                      #  0  0  0  0
is-var-set-after "var=([0]=foo)"                  #  0  0  0  0
is-var-set-after "var=([x]=foo)"                  #  0  0  0  0
is-var-set-after "var=([y]=bar [x]=foo)"          #  0  0  0  0
# '[ -n "$var" ]' fails when var is empty.
is-var-set-after "var=''"                         #  1  0  0  0
is-var-set-after "var=([0]='')"                   #  1  0  0  0
# Indices other than 0 are not detected by '[ -n "$var" ]' or by
# '[ -n "${var+x}" ]'.
is-var-set-after "var=([1]='')"                   #  1  1  0  0
is-var-set-after "var=([1]=foo)"                  #  1  1  0  0
is-var-set-after "declare -A var; var=([x]=foo)"  #  1  1  0  0
# Empty arrays are only detected by 'declare -p'.
is-var-set-after "var=()"                         #  1  1  1  0
is-var-set-after "declare -a var"                 #  1  1  1  0
is-var-set-after "declare -A var"                 #  1  1  1  0
# If 'var' is unset, then it even fails the 'declare -p var' test.
is-var-set-after "unset var"                      #  1  1  1  1
Test output

The test mnemonics in the header row correspond to [ -n "$var" ] , [ -n "${var+x}" ] , [ "${#var[@]}" != 0 ] , and declare -p var , respectively.

                         test: -n +x #@ -p
                      var=foo:  0  0  0  0
                    var=(foo):  0  0  0  0
                var=([0]=foo):  0  0  0  0
                var=([x]=foo):  0  0  0  0
        var=([y]=bar [x]=foo):  0  0  0  0
                       var='':  1  0  0  0
                 var=([0]=''):  1  0  0  0
                 var=([1]=''):  1  1  0  0
                var=([1]=foo):  1  1  0  0
declare -A var; var=([x]=foo):  1  1  0  0
                       var=():  1  1  1  0
               declare -a var:  1  1  1  0
               declare -A var:  1  1  1  0
                    unset var:  1  1  1  1
Summary
  • declare -p var &>/dev/null is (100%?) reliable for testing named variables in Bash since at least 3.0.
  • [ -n "${var+x}" ] is reliable in POSIX compliant situations, but cannot handle arrays.
  • Other tests exist for checking if a variable is nonempty, and for checking for declared variables in other shells. But these tests are suited for neither Bash nor POSIX scripts.

Peregring-lk ,Oct 18, 2014 at 22:09

Using [[ -z "$var" ]] is the easiest way to know if a variable was set or not, but that option -z doesn't distinguish between an unset variable and a variable set to an empty string:
$ set=''
$ [[ -z "$set" ]] && echo "Unset or empty" || echo "Set"
Unset or empty
$ [[ -z "$unset" ]] && echo "Unset or empty" || echo "Set"
Unset or empty

It's best to check it according to the type of variable: env variable, parameter or regular variable.

For a env variable:

[[ $(env | grep "varname=" | wc -l) -eq 1 ]] && echo "Set" || echo "Unset"

For a parameter (for example, to check existence of parameter $5 ):

[[ $# -ge 5 ]] && echo "Set" || echo "Unset"

For a regular variable (using an auxiliary function, to do it in an elegant way):

function declare_var {
   declare -p "$1" &> /dev/null
}
declare_var "var_name" && echo "Set" || echo "Unset"

Notes:

  • $# : gives you the number of positional parameters.
  • declare -p : gives you the definition of the variable passed as a parameter. If it exists, returns 0, if not, returns 1 and prints an error message.
  • &> /dev/null : suppresses output from declare -p without affecting its return code.

Dennis Williamson ,Nov 27, 2013 at 20:56

You can do:
function a {
        if [ ! -z "$1" ]; then
                echo '$1 is set'
        fi
}

LavaScornedOven ,May 11, 2017 at 13:14

The answers above do not work when the Bash option set -u is enabled. Also, they are not dynamic; e.g., how do you test whether a variable with the name "dummy" is defined? Try this:
is_var_defined()
{
    if [ $# -ne 1 ]
    then
        echo "Expected exactly one argument: variable name as string, e.g., 'my_var'"
        exit 1
    fi
    # Tricky.  Since Bash option 'set -u' may be enabled, we cannot directly test if a variable
    # is defined with this construct: [ ! -z "$var" ].  Instead, we must use default value
    # substitution with this construct: [ ! -z "${var:-}" ].  Normally, a default value follows the
    # operator ':-', but here we leave it blank for empty (null) string.  Finally, we need to
    # substitute the text from $1 as 'var'.  This is not allowed directly in Bash with this
    # construct: [ ! -z "${$1:-}" ].  We need to use indirection with eval operator.
    # Example: $1="var"
    # Expansion for eval operator: "[ ! -z \${$1:-} ]" -> "[ ! -z \${var:-} ]"
    # Code  execute: [ ! -z ${var:-} ]
    eval "[ ! -z \${$1:-} ]"
    return $?  # Pedantic.
}

Related: In Bash, how do I test if a variable is defined in "-u" mode

Aquarius Power ,Nov 15, 2014 at 17:55

My preferred way is this:
$ var=10
$ if ! ${var+false};then echo "is set";else echo "NOT set";fi
is set
$ unset var
$ if ! ${var+false};then echo "is set";else echo "NOT set";fi
NOT set

So basically, if the variable is set, the expansion becomes false , and the negation of that false is true ("is set").

If it is unset, the expansion is empty, the resulting empty command evaluates to true , and its negation is false ("NOT set").

kenorb ,Sep 22, 2014 at 13:57

In a shell you can use the -z operator, which is true if the length of the string is zero.

A simple one-liner to set default MY_VAR if it's not set, otherwise optionally you can display the message:

[[ -z "$MY_VAR" ]] && MY_VAR="default"
[[ -z "$MY_VAR" ]] && MY_VAR="default" || echo "Variable already set."

Zlatan ,Nov 20, 2013 at 18:53

if [[ ${1:+isset} ]]
then echo "It was set and not null." >&2
else echo "It was not set or it was null." >&2
fi

if [[ ${1+isset} ]]
then echo "It was set but might be null." >&2
else echo "It was was not set." >&2
fi

solidsnack ,Nov 30, 2013 at 16:47

I found a (much) better code to do this if you want to check for anything in $@ .
if [[ $1 = "" ]]
then
  echo '$1 is blank'
else
  echo '$1 is filled up'
fi

Why all this? Everything in $@ exists in Bash, but by default it's blank, so test -z and test -n couldn't help you.

Update: You can also count the number of characters in a parameter.

if [ ${#1} = 0 ]
then
  echo '$1 is blank'
else
  echo '$1 is filled up'
fi

Steven Penny ,May 11, 2014 at 4:59

[[ $foo ]]

Or

(( ${#foo} ))

Or

let ${#foo}

Or

declare -p foo

Celeo ,Feb 11, 2015 at 20:58

if [[ ${!xx[@]} ]] ; then echo xx is defined; fi

HelloGoodbye ,Nov 29, 2013 at 22:41

I always use this one, based on the fact that it seems easy to be understood by anybody who sees the code for the very first time:
if [ "$variable" = "" ]
    then
    echo "Variable X is empty"
fi

And, if you want to check that it is not empty:

if [ ! "$variable" = "" ]
    then
    echo "Variable X is not empty"
fi

That's it.

fr00tyl00p ,Nov 29, 2015 at 20:26

This is what I use every day:
#
# Check if a variable is set
#   param1  name of the variable
#
function is_set()
{
    [[ -n "${1}" ]] && test -n "$(eval "echo "\${${1}+x}"")"
}

This works well under Linux and Solaris down to bash 3.0.

bash-3.00$ myvar="TEST"
bash-3.00$ is_set myvar ; echo $?
0
bash-3.00$ mavar=""
bash-3.00$ is_set myvar ; echo $?
0
bash-3.00$ unset myvar
bash-3.00$ is_set myvar ; echo $?
1

Daniel S ,Mar 1, 2016 at 13:12

I like auxiliary functions to hide the crude details of bash. In this case, doing so adds even more (hidden) crudeness:
# The first ! negates the result (can't use -n to achieve this)
# the second ! expands the content of varname (can't do ${$varname})
function IsDeclared_Tricky
{
  local varname="$1"
  ! [ -z ${!varname+x} ]
}

Because I first had bugs in this implementation (inspired by the answers of Jens and Lionel), I came up with a different solution:

# Ask for the properties of the variable - fails if not declared
function IsDeclared()
{
  declare -p $1 &>/dev/null
}

I find it to be more straight-forward, more bashy and easier to understand/remember. Test case shows it is equivalent:

function main()
{
  declare -i xyz
  local foo
  local bar=
  local baz=''

  IsDeclared_Tricky xyz; echo "IsDeclared_Tricky xyz: $?"
  IsDeclared_Tricky foo; echo "IsDeclared_Tricky foo: $?"
  IsDeclared_Tricky bar; echo "IsDeclared_Tricky bar: $?"
  IsDeclared_Tricky baz; echo "IsDeclared_Tricky baz: $?"

  IsDeclared xyz; echo "IsDeclared xyz: $?"
  IsDeclared foo; echo "IsDeclared foo: $?"
  IsDeclared bar; echo "IsDeclared bar: $?"
  IsDeclared baz; echo "IsDeclared baz: $?"
}

main

The test case also shows that local var does NOT declare var (unless followed by '='). For quite some time I thought I declared variables this way, just to discover now that I merely expressed my intention... It's a no-op, I guess.

IsDeclared_Tricky xyz: 1
IsDeclared_Tricky foo: 1
IsDeclared_Tricky bar: 0
IsDeclared_Tricky baz: 0
IsDeclared xyz: 1
IsDeclared foo: 1
IsDeclared bar: 0
IsDeclared baz: 0

BONUS: usecase

I mostly use this test to give (and return) parameters to functions in a somewhat "elegant" and safe way (almost resembling an interface...):

#auxiliary functions
function die()
{
  echo "Error: $1"; exit 1
}

function assertVariableDeclared()
{
  IsDeclared "$1" || die "variable not declared: $1"
}

function expectVariables()
{
  while (( $# > 0 )); do
    assertVariableDeclared $1; shift
  done
}

# actual example
function exampleFunction()
{
  expectVariables inputStr outputStr
  outputStr="$inputStr world!"
}

function bonus()
{
  local inputStr='Hello'
  local outputStr= # remove this to trigger error
  exampleFunction
  echo $outputStr
}

bonus

If called with all required variables declared:

Hello world!

else:

Error: variable not declared: outputStr

Hatem Jaber ,Jun 13 at 12:08

After skimming all the answers, this also works:
if [[ -z $SOME_VAR ]]; then read -p "Enter a value for SOME_VAR: " SOME_VAR; fi
echo "SOME_VAR=$SOME_VAR"

Note that the $ in $SOME_VAR is necessary for this to work: if you write SOME_VAR without the $ , the -z test checks a literal, non-empty string, the read is skipped, and the variable is left empty.

Keith Thompson ,Aug 5, 2013 at 19:10

If you wish to test that a variable is bound or unbound, this works well, even after you've turned on the nounset option:
set -o nounset

if printenv variableName >/dev/null; then
    # variable is bound to a value
else
    # variable is unbound
fi

> ,Jan 30 at 18:23

Functions to check whether a variable is declared or unset, including an empty array=()


The following functions test if the given name exists as a variable

# The first parameter needs to be the name of the variable to be checked.
# (See example below)

var_is_declared() {
    { [[ -n ${!1+anything} ]] || declare -p $1 &>/dev/null;}
}

var_is_unset() {
    { [[ -z ${!1+anything} ]] && ! declare -p $1 &>/dev/null;} 
}

These functions would test as shown in the following conditions:

a;       # is not declared
a=;      # is declared
a="foo"; # is declared
a=();    # is declared
a=("");  # is declared
unset a; # is not declared

a;       # is unset
a=;      # is not unset
a="foo"; # is not unset
a=();    # is not unset
a=("");  # is not unset
unset a; # is unset

For more details and a test script, see my answer to the question "How do I check if a variable exists in bash?" .

Remark: The similar usage of declare -p , as it is also shown by Peregring-lk 's answer , is truly coincidental. Otherwise I would of course have credited it!

[Aug 20, 2019] Is it possible to insert separator in midnight commander menu?

Jun 07, 2010 | superuser.com


okutane ,Jun 7, 2010 at 3:36

I want to insert some items into mc menu (which is opened by F2) grouped together. Is it possible to insert some sort of separator before them or put them into some submenu?
Probably, not.
The format of the menu file is very simple. Lines that start with anything but
space or tab are considered entries for the menu (in order to be able to use
it like a hot key, the first character should be a letter). All the lines that
start with a space or a tab are the commands that will be executed when the
entry is selected.

But MC allows you to make multiple menu entries with the same shortcut and title, so you can make a menu entry that looks like a separator and does nothing, like:

a hello
  echo world
- --------
b world
  echo hello
- --------
c superuser
  ls /

In the F2 menu this shows up as three entries separated by dashed divider lines.

[Aug 20, 2019] Midnight Commander, using date in User menu

Dec 31, 2013 | unix.stackexchange.com

user2013619 ,Dec 31, 2013 at 0:43

I would like to use MC (midnight commander) to compress the selected dir with date in its name, e.g: dirname_20131231.tar.gz

The command in the User menu is :

tar -czf dirname_`date '+%Y%m%d'`.tar.gz %d

The archive is missing because %m and %d have another meaning in MC. I made an alias for the date, but that doesn't work either.

Has anybody ever solved this problem?

John1024 ,Dec 31, 2013 at 1:06

To escape the percent signs, double them:
tar -czf dirname_$(date '+%%Y%%m%%d').tar.gz %d

The above would compress the current directory (%d) to a file also in the current directory. If you want to compress the directory pointed to by the cursor rather than the current directory, use %f instead:

tar -czf %f_$(date '+%%Y%%m%%d').tar.gz %f

mc handles escaping of special characters so there is no need to put %f in quotes.

By the way, midnight commander's special treatment of percent signs occurs not just in the user menu file but also at the command line. This is an issue when using shell commands with constructs like ${var%.c} . At the command line, the same as in the user menu file, percent signs can be escaped by doubling them.
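For instance, based on the doubling rule just described, a hypothetical user menu entry that strips an extension with shell parameter expansion has to write %% wherever the shell should see a single % (the entry letter, extension and command are placeholders):

s       Show the current file name without its .txt extension
        f=%f
        echo "${f%%.txt}"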

[Aug 19, 2019] mc - Is there are any documentation about user-defined menu in midnight-commander - Unix Linux Stack Exchange

Aug 19, 2019 | unix.stackexchange.com



login ,Jun 11, 2014 at 13:13

I'd like to create my own user-defined menu for mc ( menu file). I see some lines like
+ t r & ! t t

or

+ t t

What does it mean?

goldilocks ,Jun 11, 2014 at 13:35

It is documented in the help, the node is "Edit Menu File" under "Command Menu"; if you scroll down you should find "Addition Conditions":

If the condition begins with '+' (or '+?') instead of '=' (or '=?') it is an addition condition. If the condition is true the menu entry will be included in the menu. If the condition is false the menu entry will not be included in the menu.

This is preceded by "Default conditions" (the = condition), which determine which entry will be highlighted as the default choice when the menu appears. Anyway, by way of example:

+ t r & ! t t

t r means if this is a regular file ("t(ype) r"), and ! t t means if the file has not been tagged in the interface.
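Putting the menu format and the condition together, a minimal hypothetical entry gated by that condition could look like this (the entry letter and command are placeholders):

+ t r & ! t t
v       View this regular, untagged file
        less %f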

Jarek

On top of what has been written above, this man page can also be browsed on the Internet when searching for man pages, e.g.: https://www.systutorials.com/docs/linux/man/1-mc/

Search for "Menu File Edit" .

Best regards, Jarek

[Aug 14, 2019] bash - PID background process - Unix Linux Stack Exchange

Aug 14, 2019 | unix.stackexchange.com



Raul ,Nov 27, 2016 at 18:21

As I understand pipes and commands, bash takes each command, spawns a process for each one and connects stdout of the previous one with the stdin of the next one.

For example, in "ls -lsa | grep feb", bash will create two processes, and connect the output of "ls -lsa" to the input of "grep feb".

When you execute a background command like "sleep 30 &" in bash, you get the pid of the background process running your command. Surprisingly for me, when I wrote "ls -lsa | grep feb &" bash returned only one PID.

How should this be interpreted? Does a single process run both "ls -lsa" and "grep feb"? Are several processes created, and I only get the PID of one of them?

Raul ,Nov 27, 2016 at 19:21

Spawns 2 processes. The & displays the PID of the second process. Example below.
$ echo $$
13358
$ sleep 100 | sleep 200 &
[1] 13405
$ ps -ef|grep 13358
ec2-user 13358 13357  0 19:02 pts/0    00:00:00 -bash
ec2-user 13404 13358  0 19:04 pts/0    00:00:00 sleep 100
ec2-user 13405 13358  0 19:04 pts/0    00:00:00 sleep 200
ec2-user 13406 13358  0 19:04 pts/0    00:00:00 ps -ef
ec2-user 13407 13358  0 19:04 pts/0    00:00:00 grep --color=auto 13358
$

> ,

When you run a job in the background, bash prints the process ID of its subprocess, the one that runs the command in that job. If that job happens to create more subprocesses, that's none of the parent shell's business.

When the background job is a pipeline (i.e. the command is of the form something1 | something2 & , and not e.g. { something1 | something2; } & ), there's an optimization which is strongly suggested by POSIX and performed by most shells including bash: each element of the pipeline is executed directly as a subprocess of the original shell. What POSIX mandates is that the variable $! is set to the PID of the last command in the pipeline in this case. In most shells, that last command is a subprocess of the original process, and so are the other commands in the pipeline.

When you run ls -lsa | grep feb , there are three processes involved: the one that runs the left-hand side of the pipe (a subshell that finishes setting up the pipe then executes ls ), the one that runs the right-hand side of the pipe (a subshell that finishes setting up the pipe then executes grep ), and the original process that waits for the pipe to finish.

You can watch what happens by tracing the processes:

$ strace -f -e clone,wait4,pipe,execve,setpgid bash --norc
execve("/usr/local/bin/bash", ["bash", "--norc"], [/* 82 vars */]) = 0
setpgid(0, 24084)                       = 0
bash-4.3$ sleep 10 | sleep 20 &

Note how the second sleep is reported and stored as $! , but the process group ID is the first sleep . Dash has the same oddity, ksh and mksh don't.
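A minimal sketch illustrating the same point without strace: both elements of a background pipeline are children of the original shell, while $! holds only the last one:

sleep 100 | sleep 200 &
echo "$!"       # PID of the second sleep (the last element of the pipeline)
pgrep -P $$     # lists both sleep processes: their parent is the current shell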

[Aug 14, 2019] unix - How to get PID of process by specifying process name and store it in a variable to use further - Stack Overflow

Aug 14, 2019 | stackoverflow.com

Nidhi ,Nov 28, 2014 at 0:54

pids=$(pgrep <name>)

will get you the pids of all processes with the given name. To kill them all, use

kill -9 $pids

To refrain from using a variable and directly kill all processes with a given name issue

pkill -9 <name>

panticz.de ,Nov 11, 2016 at 10:11

On a single line...
pgrep -f process_name | xargs kill -9

flazzarini ,Jun 13, 2014 at 9:54

Another possibility would be to use pidof ; it usually comes with most distributions. It will return the PID of a given process, given its name.
pidof process_name

This way you could store that information in a variable and execute kill -9 on it.

#!/bin/bash
pid=`pidof process_name`
kill -9 $pid

Pawel K ,Dec 20, 2017 at 10:27

Use grep [n]ame so you don't have to filter out the grep process itself with grep -v grep ; that's the first point. Second, using xargs the way shown above is risky; to run a command on whatever is piped in you should use -I (replace mode), otherwise you may have issues with the command.

Compare ps axf | grep name | grep -v grep | awk '{print "kill -9 " $1}' with ps aux | grep [n]ame | awk '{print "kill -9 " $2}' : isn't the latter better?
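A somewhat safer variant of the one-liners above is to let pgrep/pkill do the matching themselves, with -x for an exact name so unrelated processes are not caught (process_name is a placeholder):

pkill -9 -x process_name                        # kill every process named exactly process_name
pids=$(pgrep -x process_name) && kill -9 $pids  # or collect the PIDs first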

[Aug 14, 2019] linux - How to get PID of background process - Stack Overflow

Highly recommended!
Aug 14, 2019 | stackoverflow.com



pixelbeat ,Mar 20, 2013 at 9:11

I start a background process from my shell script, and I would like to kill this process when my script finishes.

How to get the PID of this process from my shell script? As far as I can see variable $! contains the PID of the current script, not the background process.

WiSaGaN ,Jun 2, 2015 at 14:40

You need to save the PID of the background process at the time you start it:
foo &
FOO_PID=$!
# do other stuff
kill $FOO_PID

You cannot use job control, since that is an interactive feature and tied to a controlling terminal. A script will not necessarily have a terminal attached at all so job control will not necessarily be available.

Phil ,Dec 2, 2017 at 8:01

You can use the jobs -l command to get to a particular job:
^Z
[1]+  Stopped                 guard

my_mac:workspace r$ jobs -l
[1]+ 46841 Suspended: 18           guard

In this case, 46841 is the PID.

From help jobs :

-l Report the process group ID and working directory of the jobs.

jobs -p is another option which shows just the PIDs.

Timo ,Dec 2, 2017 at 8:03

Here's a sample transcript from a bash session ( %1 refers to the ordinal number of background process as seen from jobs ):

$ echo $$
3748

$ sleep 100 &
[1] 192

$ echo $!
192

$ kill %1

[1]+  Terminated              sleep 100

lepe ,Dec 2, 2017 at 8:29

An even simpler way to kill all child processes of a bash script:
pkill -P $$

The -P flag works the same way with pkill and pgrep - it gets child processes, only with pkill the child processes get killed and with pgrep child PIDs are printed to stdout.
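For example:

sleep 300 &
pgrep -P $$    # prints the PID of the background sleep (a child of this shell)
pkill -P $$    # kills all children of this shell, including that sleep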

Luis Ramirez ,Feb 20, 2013 at 23:11

This is what I have done. Check it out; hope it can help.
#!/bin/bash
#
# So something to show.
echo "UNO" >  UNO.txt
echo "DOS" >  DOS.txt
#
# Initialize Pid List
dPidLst=""
#
# Generate background processes
tail -f UNO.txt&
dPidLst="$dPidLst $!"
tail -f DOS.txt&
dPidLst="$dPidLst $!"
#
# Report process IDs
echo PID=$$
echo dPidLst=$dPidLst
#
# Show process on current shell
ps -f
#
# Start killing background processes from list
for dPid in $dPidLst
do
        echo killing $dPid. Process is still there.
        ps | grep $dPid
        kill $dPid
        ps | grep $dPid
        echo Just ran "'"ps"'" command, $dPid must not show again.
done

Then just run it as: ./bgkill.sh with proper permissions of course

root@umsstd22 [P]:~# ./bgkill.sh
PID=23757
dPidLst= 23758 23759
UNO
DOS
UID        PID  PPID  C STIME TTY          TIME CMD
root      3937  3935  0 11:07 pts/5    00:00:00 -bash
root     23757  3937  0 11:55 pts/5    00:00:00 /bin/bash ./bgkill.sh
root     23758 23757  0 11:55 pts/5    00:00:00 tail -f UNO.txt
root     23759 23757  0 11:55 pts/5    00:00:00 tail -f DOS.txt
root     23760 23757  0 11:55 pts/5    00:00:00 ps -f
killing 23758. Process is still there.
23758 pts/5    00:00:00 tail
./bgkill.sh: line 24: 23758 Terminated              tail -f UNO.txt
Just ran 'ps' command, 23758 must not show again.
killing 23759. Process is still there.
23759 pts/5    00:00:00 tail
./bgkill.sh: line 24: 23759 Terminated              tail -f DOS.txt
Just ran 'ps' command, 23759 must not show again.
root@umsstd22 [P]:~# ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
root      3937  3935  0 11:07 pts/5    00:00:00 -bash
root     24200  3937  0 11:56 pts/5    00:00:00 ps -f

Phil ,Oct 15, 2013 at 18:22

You might also be able to use pstree:
pstree -p user

This typically gives a text representation of all the processes for the "user" and the -p option gives the process-id. It does not depend, as far as I understand, on having the processes be owned by the current shell. It also shows forks.

Phil ,Dec 4, 2018 at 9:46

pgrep can get you all of the child PIDs of a parent process. As mentioned earlier, $$ is the current script's PID. So, if you want a script that cleans up after itself, this should do the trick:
trap 'kill $( pgrep -P $$ | tr "\n" " " )' SIGINT SIGTERM EXIT

[Aug 10, 2019] How to check the file size in Linux-Unix bash shell scripting by Vivek Gite

Aug 10, 2019 | www.cyberciti.biz

The stat command shows information about the file. The syntax is as follows to get the file size on GNU/Linux stat:

stat -c %s "/etc/passwd"

OR

stat --format=%s "/etc/passwd"
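A minimal sketch that feeds the GNU stat output into a size test (the threshold is arbitrary):

size=$(stat -c %s "/etc/passwd")
if [ "$size" -gt 4096 ]; then
    echo "/etc/passwd is larger than 4 KiB ($size bytes)"
fi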

[Aug 10, 2019] bash - How to check size of a file - Stack Overflow

Aug 10, 2019 | stackoverflow.com

[ -n file.txt ] doesn't check its size , it checks that the string file.txt is non-zero length, so it will always succeed.

If you want to say " size is non-zero", you need [ -s file.txt ] .

To get a file's size , you can use wc -c to get the size ( file length) in bytes:

file=file.txt
minimumsize=90000
actualsize=$(wc -c <"$file")
if [ $actualsize -ge $minimumsize ]; then
    echo size is over $minimumsize bytes
else
    echo size is under $minimumsize bytes
fi

In this case, it sounds like that's what you want.

But FYI, if you want to know how much disk space the file is using, you could use du -k to get the size (disk space used) in kilobytes:

file=file.txt
minimumsize=90
actualsize=$(du -k "$file" | cut -f 1)
if [ $actualsize -ge $minimumsize ]; then
    echo size is over $minimumsize kilobytes
else
    echo size is under $minimumsize kilobytes
fi

If you need more control over the output format, you can also look at stat . On Linux, you'd start with something like stat -c '%s' file.txt , and on BSD/Mac OS X, something like stat -f '%z' file.txt .

--Mikel


Oz Solomon ,Jun 13, 2014 at 21:44

It surprises me that no one mentioned stat to check file size. Some methods are definitely better: using -s to find out whether the file is empty or not is easier than anything else if that's all you want. And if you want to find files of a size, then find is certainly the way to go.

I also like du a lot to get file size in kb, but, for bytes, I'd use stat :

size=$(stat -f%z $filename) # BSD stat

size=$(stat -c%s $filename) # GNU stat?
An alternative solution with awk and double parentheses:
FILENAME=file.txt
SIZE=$(du -sb $FILENAME | awk '{ print $1 }')

if ((SIZE<90000)) ; then 
    echo "less"; 
else 
    echo "not less"; 
fi

[Aug 10, 2019] command line - How do I add file and directory comparision option to mc user menu - Ask Ubuntu

Aug 10, 2019 | askubuntu.com


sorin ,Mar 30, 2012 at 8:57

I want to add Beyond Compare diff to mc (midnight commmander) user menu.

All I know is that I need to add my custom command to ~/.mc/menu but I have no idea about the syntax to use.

I want to be able to compare two files from the two panes or the directories themselves.

The command that I need to run is bcompare file1 file2 & (same for directories, it will figure it out).

mivk ,Oct 17, 2015 at 15:35

Add this to ~/.mc/menu :
+ t r & ! t t
d       Diff against file of same name in other directory
        if [ "%d" = "%D" ]; then
          echo "The two directores must be different"
          exit 1
        fi
        if [ -f %D/%f ]; then        # if two of them, then
          bcompare %f %D/%f &
        else
          echo %f: No copy in %D/%f
        fi

x       Diff file to file
        if [ -f %D/%F ]; then        # if two of them, then
          bcompare %f %D/%F &
        else
          echo %f: No copy in %D/%f
        fi

D       Diff current directory against other directory
        if [ "%d" = "%D" ]; then
          echo "The two directores must be different"
          exit 1
        fi
        bcompare %d %D &


[Aug 10, 2019] midnight commander - How to configure coloring of the file names in MC - Super User

If colors are crazy, the simplest way to solve this problem is to turn them off.
To turn off color you can also use the option mc --nocolor or the -b flag.
You can customize the colors by defining them in ~/.mc/ini , but that requires some work. Have a look here for an example: http://ajnasz.hu/blog/20080101/midnight-commander-coloring .
Aug 10, 2019 | superuser.com

Mike L. ,Jan 9, 2011 at 17:21

Is it possible to configure the Midnight Commander (Ubuntu 10.10) to show certain file and directory names differently, e.g. all hidden (starting with a period) using grey color?

Mike L. ,Feb 20, 2018 at 5:51

Under Options -> Panel Options select File highlight -> File types .

See man mc in the Colors section for ways to choose particular colors by adding entries in your ~/.config/mc/ini file. Unfortunately, there doesn't appear to be a keyword for hidden files.

[Aug 07, 2019] Find files and tar them (with spaces)

Aug 07, 2019 | stackoverflow.com



porges ,Sep 6, 2012 at 17:43

Alright, so simple problem here. I'm working on a simple backup script. It works fine except if the files have spaces in them. This is how I'm finding files and adding them to a tar archive:
find . -type f | xargs tar -czvf backup.tar.gz

The problem is when the file has a space in the name because tar thinks that it's a folder. Basically is there a way I can add quotes around the results from find? Or a different way to fix this?

Brad Parks ,Mar 2, 2017 at 18:35

Use this:
find . -type f -print0 | tar -czvf backup.tar.gz --null -T -

It will pass the file names from find to tar NUL-separated ( -print0 on the find side, --null -T - on the tar side), so names containing spaces or newlines are handled correctly.

czubehead ,Mar 19, 2018 at 11:51

There could be another way to achieve what you want. Basically,
  1. Use the find command to output path to whatever files you're looking for. Redirect stdout to a filename of your choosing.
  2. Then tar with the -T option which allows it to take a list of file locations (the one you just created with find!)
    find . -name "*.whatever" > yourListOfFiles
    tar -cvf yourfile.tar -T yourListOfFiles
    

gsteff ,May 5, 2011 at 2:05

Try running:
    find . -type f | xargs -d "\n" tar -czvf backup.tar.gz

Caleb Kester ,Oct 12, 2013 at 20:41

Why not:
tar czvf backup.tar.gz *

Sure it's clever to use find and then xargs, but you're doing it the hard way.

Update: Porges has commented with a find-option that I think is a better answer than my answer, or the other one: find -print0 ... | xargs -0 ....

Kalibur x ,May 19, 2016 at 13:54

If you have multiple files or directories and you want to zip them into independent *.gz files, you can do this (the -type f and -mtime filters are optional):
find -name "httpd-log*.txt" -type f -mtime +1 -exec tar -vzcf {}.gz {} \;

This will compress

httpd-log01.txt
httpd-log02.txt

to

httpd-log01.txt.gz
httpd-log02.txt.gz

Frank Eggink ,Apr 26, 2017 at 8:28

Why not give something like this a try: tar cvf scala.tar `find src -name *.scala`

tommy.carstensen ,Dec 10, 2017 at 14:55

Another solution as seen here :
find var/log/ -iname "anaconda.*" -exec tar -cvzf file.tar.gz {} +

Robino ,Sep 22, 2016 at 14:26

The best solution seems to be to create a file list and then archive the files, because you can use other sources and do something else with the list.

For example this allows using the list to calculate size of the files being archived:

#!/bin/sh

backupFileName="backup-big-$(date +"%Y%m%d-%H%M")"
backupRoot="/var/www"
backupOutPath=""

archivePath=$backupOutPath$backupFileName.tar.gz
listOfFilesPath=$backupOutPath$backupFileName.filelist

#
# Make a list of files/directories to archive
#
echo "" > $listOfFilesPath
echo "${backupRoot}/uploads" >> $listOfFilesPath
echo "${backupRoot}/extra/user/data" >> $listOfFilesPath
find "${backupRoot}/drupal_root/sites/" -name "files" -type d >> $listOfFilesPath

#
# Size calculation
#
sizeForProgress=`
cat $listOfFilesPath | while read nextFile;do
    if [ ! -z "$nextFile" ]; then
        du -sb "$nextFile"
    fi
done | awk '{size+=$1} END {print size}'
`

#
# Archive with progress
#
## simple with dump of all files currently archived
#tar -czvf $archivePath -T $listOfFilesPath
## progress bar
sizeForShow=$(($sizeForProgress/1024/1024))
echo -e "\nRunning backup [source files are $sizeForShow MiB]\n"
tar -cPp -T $listOfFilesPath | pv -s $sizeForProgress | gzip > $archivePath

user3472383 ,Jun 27 at 1:11

Would add a comment to @Steve Kehlet post but need 50 rep (RIP).

For anyone that has found this post through numerous googling, I found a way to not only find specific files given a time range, but also NOT include the relative paths OR whitespaces that would cause tarring errors. (THANK YOU SO MUCH STEVE.)

find . -name "*.pdf" -type f -mtime 0 -printf "%f\0" | tar -czvf /dir/zip.tar.gz --null -T -
  1. . relative directory
  2. -name "*.pdf" look for pdfs (or any file type)
  3. -type f type to look for is a file
  4. -mtime 0 look for files created in last 24 hours
  5. -printf "%f\0" Regular -print0 OR -printf "%f" did NOT work for me. From man pages:

This quoting is performed in the same way as for GNU ls. This is not the same quoting mechanism as the one used for -ls and -fls. If you are able to decide what format to use for the output of find then it is normally better to use '\0' as a terminator than to use newline, as file names can contain white space and newline characters.

  6. -czvf create the archive, filter the archive through gzip , verbosely list files processed, archive name

[Aug 06, 2019] Tar archiving that takes input from a list of files

Aug 06, 2019 | stackoverflow.com



Kurt McKee ,Apr 29 at 10:22

I have a file that contains a list of files I want to archive with tar. Let's call it mylist.txt

It contains:

file1.txt
file2.txt
...
file10.txt

Is there a way I can issue TAR command that takes mylist.txt as input? Something like

tar -cvf allfiles.tar -[someoption?] mylist.txt

So that it is similar as if I issue this command:

tar -cvf allfiles.tar file1.txt file2.txt file10.txt

Stphane ,May 25 at 0:11

Yes:
tar -cvf allfiles.tar -T mylist.txt

drue ,Jun 23, 2014 at 14:56

Assuming GNU tar (as this is Linux), the -T or --files-from option is what you want.

Stphane ,Mar 1, 2016 at 20:28

You can also pipe in the file names which might be useful:
find /path/to/files -name \*.txt | tar -cvf allfiles.tar -T -

David C. Rankin ,May 31, 2018 at 18:27

Some versions of tar, for example, the default versions on HP-UX (I tested 11.11 and 11.31), do not include a command line option to specify a file list, so a decent work-around is to do this:
tar cvf allfiles.tar $(cat mylist.txt)

Jan ,Sep 25, 2015 at 20:18

On Solaris, you can use the option -I to read the filenames that you would normally state on the command line from a file. In contrast to the command line, this can create tar archives with hundreds of thousands of files (just did that).

So the example would read

tar -cvf allfiles.tar -I mylist.txt

,

For me on AIX, it worked as follows:
tar -L List.txt -cvf BKP.tar

[Aug 06, 2019] Shell command to tar directory excluding certain files-folders

Aug 06, 2019 | stackoverflow.com



Rekhyt ,Jun 24, 2014 at 16:06

Is there a simple shell command/script that supports excluding certain files/folders from being archived?

I have a directory that needs to be archived with a sub directory that has a number of very large files I do not need to back up.

Not quite solutions:

The tar --exclude=PATTERN command matches the given pattern and excludes those files, but I need specific files & folders to be ignored (full file path), otherwise valid files might be excluded.

I could also use the find command to create a list of files, exclude the ones I don't want to archive, and pass the list to tar, but that only works for a small number of files. I have tens of thousands.

I'm beginning to think the only solution is to create a file with a list of files/folders to be excluded, then use rsync with --exclude-from=file to copy all the files to a tmp directory, and then use tar to archive that directory.

Can anybody think of a better/more efficient solution?

EDIT: Charles Ma 's solution works well. The big gotcha is that the --exclude='./folder' MUST be at the beginning of the tar command. Full command (cd first, so backup is relative to that directory):

cd /folder_to_backup
tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

James O'Brien ,Nov 24, 2016 at 9:55

You can have multiple exclude options for tar so
$ tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

etc will work. Make sure to put --exclude before the source and destination items.

Johan Soderberg ,Jun 11, 2009 at 23:10

You can exclude directories with --exclude for tar.

If you want to archive everything except /usr you can use:

tar -zcvf /all.tgz / --exclude=/usr

In your case perhaps something like

tar -zcvf archive.tgz arc_dir --exclude=dir/ignore_this_dir

cstamas ,Oct 8, 2018 at 18:02

Possible options to exclude files/directories from backup using tar:

Exclude files using multiple patterns

tar -czf backup.tar.gz --exclude=PATTERN1 --exclude=PATTERN2 ... /path/to/backup

Exclude files using an exclude file filled with a list of patterns

tar -czf backup.tar.gz -X /path/to/exclude.txt /path/to/backup

Exclude files using tags by placing a tag file in any directory that should be skipped

tar -czf backup.tar.gz --exclude-tag-all=exclude.tag /path/to/backup

Anish Ramaswamy ,Apr 1 at 16:18

Old question with many answers, but I found that none were quite clear enough for me, so I would like to add my try.

if you have the following structure

/home/ftp/mysite/

with following file/folders

/home/ftp/mysite/file1
/home/ftp/mysite/file2
/home/ftp/mysite/file3
/home/ftp/mysite/folder1
/home/ftp/mysite/folder2
/home/ftp/mysite/folder3

So, you want to make a tar file that contains everything inside /home/ftp/mysite (to move the site to a new server), but file3 is just junk, and everything in folder3 is also not needed, so we will skip those two.

we use the format

tar -czvf <name of tar file> <what to tar> <any excludes>

where c = create, z = zip, and v = verbose (you can see the files as they are entered, useful to make sure none of the files you exclude are being added), and f = file.

So, my command would look like this:

cd /home/ftp/
tar -czvf mysite.tar.gz mysite --exclude='file3' --exclude='folder3'

Note that the files/folders excluded are relative to the root of your tar (I have tried a full path here relative to / but I can not make that work).

hope this will help someone (and me next time I google it)

not2qubit ,Apr 4, 2018 at 3:24

You can use standard "ant notation" to exclude directories relative.
This works for me and excludes any .git or node_module directories.
tar -cvf myFile.tar --exclude=**/.git/* --exclude=**/node_modules/*  -T /data/txt/myInputFile.txt 2> /data/txt/myTarLogFile.txt

myInputFile.txt Contains:

/dev2/java
/dev2/javascript

GeertVc ,Feb 9, 2015 at 13:37

I've experienced that, at least with the Cygwin version of tar I'm using ("CYGWIN_NT-5.1 1.7.17(0.262/5/3) 2012-10-19 14:39 i686 Cygwin" on a Windows XP Home Edition SP3 machine), the order of options is important.

While this construction worked for me:

tar cfvz target.tgz --exclude='<dir1>' --exclude='<dir2>' target_dir

that one didn't work:

tar cfvz --exclude='<dir1>' --exclude='<dir2>' target.tgz target_dir

This, while tar --help reveals the following:

tar [OPTION...] [FILE]

So, the second command should also work, but apparently it doesn't seem to be the case...

Best rgds,

Scott Stensland ,Feb 12, 2015 at 20:55

This exclude pattern handles filename suffix like png or mp3 as well as directory names like .git and node_modules
tar --exclude={*.png,*.mp3,*.wav,.git,node_modules} -Jcf ${target_tarball}  ${source_dirname}

Michael ,May 18 at 23:29

I found this somewhere else so I won't take credit, but it worked better than any of the solutions above for my mac specific issues (even though this is closed):
tar zc --exclude __MACOSX --exclude .DS_Store -f <archive> <source(s)>

J. Lawson ,Apr 17, 2018 at 23:28

For those who have issues with it, some versions of tar would only work properly without the './' in the exclude value.
tar --version

tar (GNU tar) 1.27.1

Command syntax that work:

tar -czvf ../allfiles-butsome.tar.gz * --exclude=acme/foo

These will not work:

$ tar -czvf ../allfiles-butsome.tar.gz * --exclude=./acme/foo
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude='./acme/foo'
$ tar --exclude=./acme/foo -czvf ../allfiles-butsome.tar.gz *
$ tar --exclude='./acme/foo' -czvf ../allfiles-butsome.tar.gz *
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude=/full/path/acme/foo
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude='/full/path/acme/foo'
$ tar --exclude=/full/path/acme/foo -czvf ../allfiles-butsome.tar.gz *
$ tar --exclude='/full/path/acme/foo' -czvf ../allfiles-butsome.tar.gz *

Jerinaw ,May 6, 2017 at 20:07

For Mac OSX I had to do

tar -zcv --exclude='folder' -f theOutputTarFile.tar folderToTar

Note the -f after the --exclude=

Aaron Votre ,Jul 15, 2016 at 15:56

I agree the --exclude flag is the right approach.
$ tar --exclude='./folder_or_file' --exclude='file_pattern' --exclude='fileA'

A word of warning for a side effect that I did not find immediately obvious: The exclusion of 'fileA' in this example will search for 'fileA' RECURSIVELY!

Example: a directory with a single subdirectory containing a file of the same name (data.txt)

data.txt
config.txt
--+dirA
  |  data.txt
  |  config.docx

Znik ,Nov 15, 2014 at 5:12

To avoid possible 'xargs: Argument list too long' errors due to the use of find ... | xargs ... when processing tens of thousands of files, you can pipe the output of find directly to tar using find ... -print0 | tar --null ... .
# archive a given directory, but exclude various files & directories 
# specified by their full file paths
find "$(pwd -P)" -type d \( -path '/path/to/dir1' -or -path '/path/to/dir2' \) -prune \
   -or -not \( -path '/path/to/file1' -or -path '/path/to/file2' \) -print0 | 
   gnutar --null --no-recursion -czf archive.tar.gz --files-from -
   #bsdtar --null -n -czf archive.tar.gz -T -

Mike ,May 9, 2014 at 21:29

After reading this thread, I did a little testing on RHEL 5 and here are my results for tarring up the abc directory:

This will exclude the directories error and logs and all files under the directories:

tar cvpzf abc.tgz abc/ --exclude='abc/error' --exclude='abc/logs'

Adding a wildcard after the excluded directory will exclude the files but preserve the directories:

tar cvpzf abc.tgz abc/ --exclude='abc/error/*' --exclude='abc/logs/*'
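
A minimal way to see the difference for yourself (a sketch; the abc layout here is hypothetical) is to build a small test tree and compare the listings of the two archives:

mkdir -p abc/error abc/logs && touch abc/error/e.log abc/logs/l.log abc/keep.txt
tar cvpzf abc-nodirs.tgz abc/ --exclude='abc/error' --exclude='abc/logs'
tar cvpzf abc-emptydirs.tgz abc/ --exclude='abc/error/*' --exclude='abc/logs/*'
# the first listing has neither directory; the second keeps abc/error/ and abc/logs/ as empty entries
tar -tzf abc-nodirs.tgz
tar -tzf abc-emptydirs.tgz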

Alex B ,Jun 11, 2009 at 23:03

Use the find command in conjunction with the tar append (-r) option. This way you can add files to an existing tar in a single step, instead of a two pass solution (create list of files, create tar).
find /dir/dir -prune ... -o etc etc.... -exec tar rvf ~/tarfile.tar {} \;

frommelmak ,Sep 10, 2012 at 14:08

You can also use one of the "--exclude-tag" options depending on your needs:

The folder hosting the specified FILE will be excluded.

camh ,Jun 12, 2009 at 5:53

You can use cpio(1) to create tar files. cpio takes the files to archive on stdin, so if you've already figured out the find command you want to use to select the files to archive, pipe it into cpio to create the tar file:
find ... | cpio -o -H ustar | gzip -c > archive.tar.gz

PicoutputCls ,Aug 21, 2018 at 14:13

In GNU tar v1.26 the --exclude needs to come after the archive file and backup directory arguments, should have no leading or trailing slashes, and prefers no quotes (single or double). So, relative to the PARENT directory to be backed up, it's:

tar cvfz /path_to/mytar.tgz ./dir_to_backup --exclude=some_path/to_exclude

user2553863 ,May 28 at 21:41

After reading all these good answers for different versions and having solved the problem for myself, I think there are small details that are very important, and unusual for GNU/Linux use in general, that aren't stressed enough and deserve more than comments.

So I'm not going to try to answer the question for every case, but instead try to note where to look when things don't work.

IT IS VERY IMPORTANT TO NOTICE:

  1. THE ORDER OF THE OPTIONS MATTERS: it is not the same to put the --exclude before or after the archive name and the directories to back up. This is unexpected, at least to me, because in my experience the order of options usually doesn't matter in GNU/Linux commands.
  2. Different tar versions expect the options in different orders: for instance, @Andrew's answer indicates that in GNU tar v1.26 and 1.28 the excludes come last, whereas in my case, with GNU tar 1.29, it's the other way around.
  3. THE TRAILING SLASHES MATTER: at least in GNU tar 1.29, there shouldn't be any.

In my case, for GNU tar 1.29 on Debian stretch, the command that worked was

tar --exclude="/home/user/.config/chromium" --exclude="/home/user/.cache" -cf file.tar  /dir1/ /home/ /dir3/

The quotes didn't matter, it worked with or without them.

I hope this will be useful to someone.

jørgensen ,Dec 19, 2015 at 11:10

Your best bet is to use find with tar, via xargs (to handle the large number of arguments). For example:
find / -print0 | xargs -0 tar cjf tarfile.tar.bz2

Ashwini Gupta ,Jan 12, 2018 at 10:30

tar -cvzf destination_folder source_folder -X /home/folder/excludes.txt

-X indicates a file which contains a list of filenames which must be excluded from the backup. For Instance, you can specify *~ in this file to not include any filenames ending with ~ in the backup.
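
For example, a minimal sketch (the paths and patterns are hypothetical) that builds such an exclude file and passes it to tar:

# one shell pattern per line; any member matching a pattern is skipped
printf '%s\n' '*~' '*.log' 'tmp/*' > /home/folder/excludes.txt
tar -cvzf backup.tar.gz -X /home/folder/excludes.txt source_folder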

George ,Sep 4, 2013 at 22:35

Possibly a redundant answer, but since I found it useful, here it is:

While root on FreeBSD (i.e. using csh) I wanted to copy my whole root filesystem to /mnt but without /usr and (obviously) /mnt. This is what worked (I am at /):

tar --exclude ./usr --exclude ./mnt --create --file - . | (cd /mnt && tar xvf -)

My whole point is that it was necessary (by putting the ./ ) to specify to tar that the excluded directories were part of the greater directory being copied.

My €0.02

t0r0X ,Sep 29, 2014 at 20:25

I had no luck getting tar to exclude a 5 gigabyte subdirectory a few levels deep. In the end, I just used the Unix zip command. It was a lot easier for me.

So for this particular example from the original post
(tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz . )

The equivalent would be:

zip -r /backup/filename.zip . -x upload/folder/**\* upload/folder2/**\*

(NOTE: Here is the post I originally used that helped me https://superuser.com/questions/312301/unix-zip-directory-but-excluded-specific-subdirectories-and-everything-within-t )

RohitPorwal ,Jul 21, 2016 at 9:56

Check it out
tar cvpzf zip_folder.tgz . --exclude=./public --exclude=./tmp --exclude=./log --exclude=fileName

tripleee ,Sep 14, 2017 at 4:38

The following bash script should do the trick. It uses the answer given here by Marcus Sundman.
#!/bin/bash

echo -n "Please enter the name of the tar file you wish to create with out extension "
read nam

echo -n "Please enter the path to the directories to tar "
read pathin

echo tar -czvf $nam.tar.gz
excludes=`find $pathin -iname "*.CC" -exec echo "--exclude \'{}\'" \;|xargs`
echo $pathin

echo tar -czvf $nam.tar.gz $excludes $pathin

This will print out the command you need and you can just copy and paste it back in. There is probably a more elegant way to provide it directly to the command line.

Just change *.CC for any other common extension, file name or regex you want to exclude and this should still work.

EDIT

Just to add a little explanation: find generates a list of files matching the chosen pattern (in this case *.CC). This list is passed via xargs to the echo command. This prints --exclude 'one entry from the list'. The backslashes (\) are escape characters for the ' marks.

[Aug 06, 2019] bash - More efficient way to find tar millions of files - Stack Overflow

Aug 06, 2019 | stackoverflow.com



theomega ,Apr 29, 2010 at 13:51

I've got a job running on my server at the command line prompt for a two days now:
find data/ -name filepattern-*2009* -exec tar uf 2009.tar {} \;

It is taking forever , and then some. Yes, there are millions of files in the target directory. (Each file is a measly 8 bytes in a well hashed directory structure.) But just running...

find data/ -name filepattern-*2009* -print > filesOfInterest.txt

...takes only two hours or so. At the rate my job is running, it won't be finished for a couple of weeks. That seems unreasonable. Is there a more efficient way to do this? Maybe with a more complicated bash script?

A secondary question is: "why is my current approach so slow?"

Stu Thompson ,May 6, 2013 at 1:11

If you already did the second command that created the file list, just use the -T option to tell tar to read the file names from that saved file list. Running 1 tar command vs N tar commands will be a lot better.
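
A sketch of that approach, reusing the list produced by the earlier find run (the pattern is quoted so the shell does not expand it):

find data/ -name 'filepattern-*2009*' -print > filesOfInterest.txt
# a single tar invocation reads every name from the list, instead of one tar run per file
tar -uf 2009.tar -T filesOfInterest.txt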

Matthew Mott ,Jul 3, 2014 at 19:21

One option is to use cpio to generate a tar-format archive:
$ find data/ -name "filepattern-*2009*" | cpio -ov --format=ustar > 2009.tar

cpio works natively with a list of filenames from stdin, rather than a top-level directory, which makes it an ideal tool for this situation.

bashfu ,Apr 23, 2010 at 10:05

Here's a find-tar combination that can do what you want without the use of xargs or exec (which should result in a noticeable speed-up):
tar --version    # tar (GNU tar) 1.14 

# FreeBSD find (on Mac OS X)
find -x data -name "filepattern-*2009*" -print0 | tar --null --no-recursion -uf 2009.tar --files-from -

# for GNU find use -xdev instead of -x
gfind data -xdev -name "filepattern-*2009*" -print0 | tar --null --no-recursion -uf 2009.tar --files-from -

# added: set permissions via tar
find -x data -name "filepattern-*2009*" -print0 | \
    tar --null --no-recursion --owner=... --group=... --mode=... -uf 2009.tar --files-from -

Stu Thompson ,Apr 28, 2010 at 12:50

There is xargs for this:
find data/ -name filepattern-*2009* -print0 | xargs -0 tar uf 2009.tar

Guessing why it is slow is hard, as there is not much information: what is the structure of the directory, what filesystem do you use, and how was it configured when created? Having millions of files in a single directory is quite a hard situation for most filesystems.

bashfu ,May 1, 2010 at 14:18

To correctly handle file names with weird (but legal) characters (such as newlines, ...) you should write your file list to filesOfInterest.txt using find's -print0:
find -x data -name "filepattern-*2009*" -print0 > filesOfInterest.txt
tar --null --no-recursion -uf 2009.tar --files-from filesOfInterest.txt

Michael Aaron Safyan ,Apr 23, 2010 at 8:47

The way you currently have things, you are invoking the tar command every single time it finds a file, which is unsurprisingly slow. Instead of taking the two hours to print plus the amount of time it takes to open the tar archive, see if the files are out of date, and add them to the archive, you are actually multiplying those times together. You might have better success invoking the tar command once, after you have batched together all the names, possibly using xargs to achieve the invocation. By the way, I hope you are using 'filepattern-*2009*' and not filepattern-*2009*, as the stars will be expanded by the shell without the quotes.

ruffrey ,Nov 20, 2018 at 17:13

There is a utility for this called tarsplitter .
tarsplitter -m archive -i folder/*.json -o archive.tar -p 8

will use 8 threads to archive the files matching "folder/*.json" into an output archive of "archive.tar"

https://github.com/AQUAOSOTech/tarsplitter

syneticon-dj ,Jul 22, 2013 at 8:47

Simplest (also removes files after archive creation):
find *.1  -exec tar czf '{}.tgz' '{}' --remove-files \;

[Aug 06, 2019] backup - Fastest way combine many files into one (tar czf is too slow) - Unix Linux Stack Exchange

Aug 06, 2019 | unix.stackexchange.com



Gilles ,Nov 5, 2013 at 0:05

Currently I'm running tar czf to combine backup files. The files are in a specific directory.

But the number of files is growing. Using tar czf takes too much time (more than 20 minutes and counting).

I need to combine the files more quickly and in a scalable fashion.

I've found genisoimage , readom and mkisofs . But I don't know which is fastest and what the limitations are for each of them.

Rufo El Magufo ,Aug 24, 2017 at 7:56

You should check whether most of your time is being spent on CPU or on I/O. Either way, there are ways to improve it:

A: don't compress

You didn't mention "compression" in your list of requirements, so try dropping the "z" from your argument list: tar cf. This might speed things up a bit.

There are other techniques to speed up the process, like using "-N" to skip files you have already backed up before.

B: backup the whole partition with dd

Alternatively, if you're backing up an entire partition, take a copy of the whole disk image instead. This would save processing and a lot of disk head seek time. tar and any other program working at a higher level have the overhead of reading and processing directory entries and inodes to find where the file content is, and of doing more disk head seeks, reading each file from a different place on the disk.

To backup the underlying data much faster, use:

dd bs=16M if=/dev/sda1 of=/another/filesystem

(This assumes you're not using RAID, which may change things a bit)

,

To repeat what others have said: we need to know more about the files that are being backed up. I'll go with some assumptions here.

Append to the tar file

If files are only being added to the directories (that is, no file is being deleted), make sure you are appending to the existing tar file rather than re-creating it every time. You can do this by specifying the existing archive filename in your tar command instead of a new one (or deleting the old one).
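
As a minimal sketch of that idea (the archive and directory names are hypothetical); note that appending only works on a plain, uncompressed .tar, not on a .tar.gz:

# first run: create the archive
tar -cf /backups/files.tar /data/files
# later runs: append only files newer than the given date instead of re-creating everything
tar -rf /backups/files.tar -N '2019-08-01' /data/files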

Write to a different disk

Reading from the same disk you are writing to may be killing performance. Try writing to a different disk to spread the I/O load. If the archive file needs to be on the same disk as the original files, move it afterwards.

Don't compress

Just repeating what @Yves said. If your backup files are already compressed, there's not much need to compress again. You'll just be wasting CPU cycles.

[Aug 02, 2019] linux - How to tar directory and then remove originals including the directory - Super User

Aug 02, 2019 | superuser.com



mit ,Dec 7, 2016 at 1:22

I'm trying to tar a collection of files in a directory called 'my_directory' and remove the originals by using the command:
tar -cvf files.tar my_directory --remove-files

However it is only removing the individual files inside the directory and not the directory itself (which is what I specified in the command). What am I missing here?

EDIT:

Yes, I suppose the 'remove-files' option is fairly literal, although I too found the man page unclear on that point. (In Linux I tend not to really distinguish between directories and files that much, and sometimes forget that they are not the same thing.) It looks like the consensus is that it doesn't remove directories.

However, my major prompting point for asking this question stems from tar's handling of absolute paths. Because you must specify a relative path to the file(s) to be compressed, you must change to the parent directory to tar it properly. As I see it, using any kind of follow-on 'rm' command is potentially dangerous in that situation. Thus I was hoping to simplify things by making tar itself do the removal.

For example, imagine a backup script where the directory to back up (i.e. tar) is passed in as a shell variable. If that shell variable value were badly entered, the result could be deleted files in whatever directory you happened to be in last.

Arjan ,Feb 13, 2016 at 13:08

You are missing the part which says the --remove-files option removes files after adding them to the archive.

You could follow the archive and file-removal operation with a command like,

find /path/to/be/archived/ -depth -type d -empty -exec rmdir {} \;


Update: You may be interested in reading this short Debian discussion on,
Bug 424692: --remove-files complains that directories "changed as we read it" .

Kim ,Feb 13, 2016 at 13:08

Since the --remove-files option only removes files , you could try
tar -cvf files.tar my_directory && rm -R my_directory

so that the directory is removed only if the tar returns an exit status of 0
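
Building on that, a defensive sketch (variable and path names are hypothetical) that also guards against the badly-entered shell variable mentioned in the question before anything gets removed:

backup_dir="/srv/www/mysite"    # would normally come from the script's configuration
# refuse to continue if the variable is empty or is not an existing directory
[ -n "$backup_dir" ] && [ -d "$backup_dir" ] || { echo "bad backup_dir: '$backup_dir'" >&2; exit 1; }
tar -czf "/backups/$(basename "$backup_dir")-$(date +%Y%m%d).tar.gz" \
    -C "$(dirname "$backup_dir")" "$(basename "$backup_dir")" \
  && rm -rf "$backup_dir"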

redburn ,Feb 13, 2016 at 13:08

Have you tried putting the --remove-files directive after the archive name? It works for me.
tar -cvf files.tar --remove-files my_directory

shellking ,Oct 4, 2010 at 19:58

source={directory argument}

e.g.

source={FULL ABSOLUTE PATH}/my_directory
parent={parent directory of argument}

e.g.

parent={ABSOLUTE PATH of 'my_directory'}/
logFile={path to a run log that captures status messages}

Then you could execute something along the lines of:

cd ${parent}

tar cvf Tar_File.`date +%Y%m%d_%H%M%S` ${source}

if [ $? != 0 ]

then

 echo "Backup FAILED for ${source} at `date` >> ${logFile}

else

 echo "Backup SUCCESS for ${source} at `date` >> ${logFile}

 rm -rf ${source}

fi

mit ,Nov 14, 2011 at 13:21

This was probably a bug.

Also the word "file" is ambigous in this case. But because this is a command line switch I would it expect to mean also directories, because in unix/lnux everything is a file, also a directory. (The other interpretation is of course also valid, but It makes no sense to keep directories in such a case. I would consider it unexpected and confusing behavior.)

But I have found that in gnu tar on some distributions gnu tar actually removes the directory tree. Another indication that keeping the tree was a bug. Or at least some workaround until they fixed it.

This is what I tried out on an ubuntu 10.04 console:

mit:/var/tmp$ mkdir tree1                                                                                               
mit:/var/tmp$ mkdir tree1/sub1                                                                                          
mit:/var/tmp$ > tree1/sub1/file1                                                                                        

mit:/var/tmp$ ls -la                                                                                                    
drwxrwxrwt  4 root root 4096 2011-11-14 15:40 .                                                                              
drwxr-xr-x 16 root root 4096 2011-02-25 03:15 ..
drwxr-xr-x  3 mit  mit  4096 2011-11-14 15:40 tree1

mit:/var/tmp$ tar -czf tree1.tar.gz tree1/ --remove-files

# AS YOU CAN SEE THE TREE IS GONE NOW:

mit:/var/tmp$ ls -la
drwxrwxrwt  3 root root 4096 2011-11-14 15:41 .
drwxr-xr-x 16 root root 4096 2011-02-25 03:15 ..
-rw-r--r--  1 mit   mit    159 2011-11-14 15:41 tree1.tar.gz                                                                   


mit:/var/tmp$ tar --version                                                                                             
tar (GNU tar) 1.22                                                                                                           
Copyright © 2009 Free Software Foundation, Inc.

If you want to see it on your machine, paste this into a console at your own risk:

tar --version                                                                                             
cd /var/tmp
mkdir -p tree1/sub1                                                                                          
> tree1/sub1/file1                                                                                        
tar -czf tree1.tar.gz tree1/ --remove-files
ls -la

[Jul 31, 2019] Is Ruby moving toward extinction?

Jul 31, 2019 | developers.slashdot.org

timeOday ( 582209 ) , Monday July 29, 2019 @03:44PM ( #59007686 )

Re:ORLY ( Score: 5 , Insightful)

This is what it feels like to actually learn from an article instead of simply having it confirm your existing beliefs.

Here is what it says:

An analysis of Dice job-posting data over the past year shows a startling dip in the number of companies looking for technology professionals who are skilled in Ruby. In 2018, the number of Ruby jobs declined 56 percent. That's a huge warning sign that companies are turning away from Ruby - and if that's the case, the language's user-base could rapidly erode to almost nothing.

Well, what's your evidence-based rebuttal to that?

Wdomburg ( 141264 ) writes:
Re: ( Score: 2 )

If you actually look at the TIOBE rankings, it's #11 (not #12 as claimed in the article), and back on the upswing. If you look at RedMonk, which they say they looked at but don't reference with respect to Ruby, it is a respectable #8, being one of the top languages on GitHub and Stack Overflow.

We are certainly past the glory days of Ruby, when it was the Hot New Thing and everyone was deploying Rails, but to suggest that it is "probably doomed" seems a somewhat hysterical prediction.

OrangeTide ( 124937 ) , Tuesday July 30, 2019 @01:52AM ( #59010348 ) Homepage Journal
Re:ORLY ( Score: 4 , Funny)
How do they know how many Ruby jobs there are? Maybe how many Ruby job openings announced, but not the actual number of jobs. Or maybe they are finding Ruby job-applicants and openings via other means.

Maybe there is a secret list of Ruby job postings only available to the coolest programmers? Man! I never get to hang out with the cool kids.

jellomizer ( 103300 ) , Monday July 29, 2019 @03:48PM ( #59007714 )
Re:ORLY ( Score: 5 , Insightful)

Perhaps devops/web programming is a dying field.

But to be fair, Ruby had its peak about 10 years ago. With Ruby on Rails. However the problem is the "Rails" started to get very dated. And Python and Node.JS had taken its place.

whitelabrat ( 469237 ) , Monday July 29, 2019 @03:57PM ( #59007778 )
Re:ORLY ( Score: 5 , Insightful)

I don't see Ruby dying anytime soon, but I do get the feeling that Python is the go-to scripting language for all the things now. I learned Ruby and wish I spent that time learning Python.

Perl is perl. It will live on, but anybody writing new things with it probably needs to have a talkin' to.

phantomfive ( 622387 ) , Monday July 29, 2019 @07:32PM ( #59009188 ) Journal
Re:ORLY ( Score: 4 , Insightful)
I learned Ruby and wish I spent that time learning Python.

Ruby and Python are basically the same thing. With a little google, you can literally start programming in Python today. Search for "print python" and you can easily write a hello world. search for 'python for loop' and suddenly you can do repetitious tasks. Search for "define function python" and you can organize your code.

After that do a search for hash tables and lists in Python and you'll be good enough to pass a coding interview in the language.

[Jul 31, 2019] 5 Programming Languages That Are Probably Doomed

The article is clickbait; entrenched languages seldom die. But some Slashdot comments are interesting.
Jul 31, 2019 | developers.slashdot.org

NoNonAlphaCharsHere ( 2201864 ) , Monday July 29, 2019 @03:39PM ( #59007638 )

Re:ORLY ( Score: 5 , Funny)

Perl has been "doomed" for over 30 years now, hasn't stopped it.

geekoid ( 135745 ) writes:
Re: ( Score: 2 )

OTOH, it's not exactly what it once was.

IMO: if you can't write good readable code in PERL, you should find a new business to work in.

Anonymous Coward writes:
check the job description ( Score: 3 , Funny)

Writing unreadable perl is the business.

ShanghaiBill ( 739463 ) writes:
Re: ( Score: 3 )
Perl has been "doomed" for over 30 years now, hasn't stopped it.

I love Perl, but today it is mostly small throw-away scripts and maintaining legacy apps.

It makes little sense to use Perl for a new project.

Perl won't disappear, but the glory days are in the past.

Anonymous Coward , Monday July 29, 2019 @03:59PM ( #59007794 )
Re:ORLY ( Score: 4 , Interesting)

I write new code in perl all the time. Cleanly written, well formatted and completely maintainable. Simply because YOU can't write perl in such a manner, that doesn't mean others can't.

Anonymous Coward writes:
Re: ORLY ( Score: 2 , Insightful)

Do you have someone else who is saying that about your code or is that your own opinion?

Sarten-X ( 1102295 ) , Monday July 29, 2019 @05:53PM ( #59008624 ) Homepage
Re: ORLY ( Score: 4 , Insightful)

I happen to read a lot of Perl in my day job, involving reverse-engineering a particular Linux-based appliance for integration purposes. I seldom come across scripts that are actually that bad.

It's important to understand that Perl has a different concept of readability. It's more like reading a book than reading a program, because there are so many ways to write any given task. A good Perl programmer will incorporate that flexibility into their style, so intent can be inferred not just from the commands used, but also how the code is arranged. For example, a large block describing a complex function would be written verbosely for detailed clarity.

A trivial statement could be used, if it resolves an edge case.

Conversely, a good Perl reader will be familiar enough with the language to understand the idioms and shorthand used, so they can understand the story as written without being distracted by the ugly bits. Once viewed from that perspective, a Perl program can condense incredible amounts of description into just a few lines, and still be as readily-understood as any decent novel.

Sarten-X ( 1102295 ) writes: on Monday July 29, 2019 @07:06PM ( #59009056 ) Homepage
Re: ORLY ( Score: 4 , Insightful)

Since you brought it up...

In building several dev teams, I have never tried to hire everyone with any particular skill. I aim to have at least two people with each skill, but won't put effort to having more than that at first. After the initial startup of the team, I try to run projects in pairs, with an expert starting the project, then handing it to a junior (in that particular skill) for completion. After a few rounds of that, the junior is close enough to an expert, and somebody else takes the junior role. That way, even with turnover, expertise is shared among the team, and there's always someone who can be the expert.

Back to the subject at hand, though...

My point is that Perl is a more conversational language than others, and its structure reflects that. It is unreasonable to simply look at Perl code, see the variety of structures, and declare it "unreadable" simply because the reader doesn't understand the language.

As an analogy, consider the structural differences between Lord of the Rings and The Cat in the Hat . A reader who is only used to The Cat in the Hat would find Lord of the Rings to be ridiculously complex to the point of being unreadable, when Lord of the Rings is simply making use of structures and capabilities that are not permitted in the language of young children's books.

This is not to say that other languages are wrong to have a more limited grammar. They are simply different, and learning to read a more flexible language is a skill to be developed like any other. Similar effort must be spent to learn other languages with sufficiently-different structure, like Lisp or Haskell.

phantomfive ( 622387 ) , Monday July 29, 2019 @07:24PM ( #59009128 ) Journal
Re:ORLY ( Score: 3 )

FWIW DuckDuckGo is apparently written primarily in Perl.

fahrbot-bot ( 874524 ) , Monday July 29, 2019 @03:46PM ( #59007696 )
If your career is based on ... ( Score: 3 , Interesting)

From TFA:

Perl: Even if RedMonk has Perl's popularity declining, it's still going to take a long time for the language to flatten out completely, given the sheer number of legacy websites that still feature its code. Nonetheless, a lack of active development, and widespread developer embrace of other languages for things like building websites, means that Perl is going to just fall into increasing disuse.

First, Perl is used for many, many more things than websites -- and the focus in TFA is short-sighted. Second, I've written a LOT of Perl in my many years, but wouldn't say my (or most people's) career is based on it. Yes, I have written applications in Perl, but more often used it for utility, glue and other things that help get things done, monitor and (re)process data. Nothing (or very few things) can beat Perl for a quick knock-off script to do something or another.

Perl's not going anywhere and it will be a useful language to know for quite a while. Languages like Perl (and Python) are great tools to have in your toolbox, ones that you know how to wield well when you need them. Knowing when you need them, and not something else, is important.

TimHunter ( 174406 ) , Monday July 29, 2019 @05:22PM ( #59008400 )
Career based on *a* programming language? ( Score: 4 , Insightful)

Anybody whose career is based on a single programming language is doomed already. Programmers know how to write code. The language they use is beside the point. A good programmer can write code in whatever language is asked of them.

bobbied ( 2522392 ) , Monday July 29, 2019 @04:23PM ( #59007966 )
Re:Diversifying ( Score: 5 , Insightful)
The writer of this article should consider diversifying his skillset at some point, as not all bloggers endure forever and his popularity ranking on Slashdot has recently tanked.

I'd suggest that this writer quit their day job and take up stand up...

Old languages never really die until the platform dies. Languages may fall out of favor, but they don't usually die until the platform they are running on disappears and then the people who used them die. So, FORTRAN, C, C++, and COBOL and more are here to pretty much stay.

Specifically, Perl isn't going anywhere, being fundamental on Linux, and neither is Ruby; the rest have to varying degrees been out of favor for a while now, but none of the languages in the article are dead. They are, however, falling out of favor, and because of that it might be a good idea to add other tools to your programmer's toolbox if your livelihood depends on one of them.

[Jul 30, 2019] Python is overrated

Notable quotes:
"... R commits a substantial scale crime by being so dependent on memory-resident objects. Python commits major scale crime with its single-threaded primary interpreter loop. ..."
Jul 29, 2019 | developers.slashdot.org

epine ( 68316 ), Monday July 29, 2019 @05:48PM ( #59008600 ) ( Score: 3 )


I had this naive idea that Python might substantially displace R until I learned more about the Python internals, which are pretty nasty. This is the new generation's big data language? If true, sure sucks to be young again.

Python isn't even really used to do big data. It's mainly used to orchestrate big data flows on top of other libraries or facilities. It has more or less become the lingua franca of high-level hand waving. Any real grunt is far below.

R commits a substantial scale crime by being so dependent on memory-resident objects. Python commits major scale crime with its single-threaded primary interpreter loop.

If I move away from R, it will definitely be Julia for any real work (as Julia matures, if it matures well), and not Python.

[Jul 30, 2019] The difference between tar and tar.gz archives

With a tar.gz, to extract files the archiver must first uncompress the whole archive (in effect recreating the intermediate x.tar from x.tar.gz) and only then unpack the requested files from that tarball, since gzip offers no random access into the compressed stream. If the tar.gz archive is large, unpacking can take several hours or even days.
Jul 30, 2019 | askubuntu.com
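
A quick way to see the effect (a sketch; the archive name and member path are hypothetical) is to time the extraction of a single member, which still has to decompress the gzip stream from the beginning:

# even for one small member, the whole compressed stream up to (and usually past) that member is read
time tar -xzf big-archive.tar.gz path/inside/archive/one-file.txt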

[Jul 29, 2019] How do I tar a directory of files and folders without including the directory itself - Stack Overflow

Jan 05, 2017 | stackoverflow.com



tvanfosson ,Jan 5, 2017 at 12:29

I typically do:
tar -czvf my_directory.tar.gz my_directory

What if I just want to include everything (including any hidden system files) in my_directory, but not the directory itself? I don't want:

my_directory
   --- my_file
   --- my_file
   --- my_file

I want:

my_file
my_file
my_file

PanCrit ,Feb 19 at 13:04

cd my_directory/ && tar -zcvf ../my_dir.tgz . && cd -

should do the job in one line. It works well for hidden files as well; "*" doesn't match hidden files via pathname expansion, at least in bash. Below is my experiment:

$ mkdir my_directory
$ touch my_directory/file1
$ touch my_directory/file2
$ touch my_directory/.hiddenfile1
$ touch my_directory/.hiddenfile2
$ cd my_directory/ && tar -zcvf ../my_dir.tgz . && cd ..
./
./file1
./file2
./.hiddenfile1
./.hiddenfile2
$ tar ztf my_dir.tgz
./
./file1
./file2
./.hiddenfile1
./.hiddenfile2

JCotton ,Mar 3, 2015 at 2:46

Use the -C switch of tar:
tar -czvf my_directory.tar.gz -C my_directory .

The -C my_directory tells tar to change the current directory to my_directory , and then . means "add the entire current directory" (including hidden files and sub-directories).

Make sure you do -C my_directory before you do . or else you'll get the files in the current directory.
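
A minimal check of that behaviour (the directory name is hypothetical):

mkdir -p my_directory && touch my_directory/file1 my_directory/.hidden
tar -czvf my_directory.tar.gz -C my_directory .
# every entry is listed relative to '.', with no leading my_directory/ component
tar -tzf my_directory.tar.gz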

Digger ,Mar 23 at 6:52

You can also create the archive as usual and extract it with:
tar --strip-components 1 -xvf my_directory.tar.gz

jwg ,Mar 8, 2017 at 12:56

Have a look at --transform / --xform ; it gives you the opportunity to massage the file name as the file is added to the archive:
% mkdir my_directory
% touch my_directory/file1
% touch my_directory/file2
% touch my_directory/.hiddenfile1
% touch my_directory/.hiddenfile2
% tar -v -c -f my_dir.tgz --xform='s,my_directory/,,' $(find my_directory -type f)
my_directory/file2
my_directory/.hiddenfile1
my_directory/.hiddenfile2
my_directory/file1
% tar -t -f my_dir.tgz 
file2
.hiddenfile1
.hiddenfile2
file1

The transform expression is similar to that of sed , and we can use separators other than / ( , in the above example).
https://www.gnu.org/software/tar/manual/html_section/tar_52.html

Alex ,Mar 31, 2017 at 15:40

TL;DR
find /my/dir/ -printf "%P\n" | tar -czf mydir.tgz --no-recursion -C /my/dir/ -T -

With some conditions (archive only files, dirs and symlinks):

find /my/dir/ -printf "%P\n" -type f -o -type l -o -type d | tar -czf mydir.tgz --no-recursion -C /my/dir/ -T -
Explanation

The below unfortunately includes a parent directory ./ in the archive:

tar -czf mydir.tgz -C /my/dir .

You can move all the files out of that directory by using the --transform configuration option, but that doesn't get rid of the . directory itself. It becomes increasingly difficult to tame the command.

You could use $(find ...) to add a file list to the command (like in magnus' answer ), but that potentially causes a "file list too long" error. The best way is to combine it with tar's -T option, like this:

find /my/dir/ -printf "%P\n" -type f -o -type l -o -type d | tar -czf mydir.tgz --no-recursion -C /my/dir/ -T -

Basically what it does is list all files ( -type f ), links ( -type l ) and subdirectories ( -type d ) under your directory, make all filenames relative using -printf "%P\n" , and then pass that to the tar command (it takes filenames from STDIN using -T - ). The -C option is needed so tar knows where the files with relative names are located. The --no-recursion flag is so that tar doesn't recurse into folders it is told to archive (causing duplicate files).

If you need to do something special with filenames (filtering, following symlinks etc), the find command is pretty powerful, and you can test it by just removing the tar part of the above command:

$ find /my/dir/ -printf "%P\n" -type f -o -type l -o -type d
> textfile.txt
> documentation.pdf
> subfolder2
> subfolder
> subfolder/.gitignore

For example if you want to filter PDF files, add ! -name '*.pdf'

$ find /my/dir/ -printf "%P\n" -type f ! -name '*.pdf' -o -type l -o -type d
> textfile.txt
> subfolder2
> subfolder
> subfolder/.gitignore
Non-GNU find

The command uses printf (available in GNU find ) which tells find to print its results with relative paths. However, if you don't have GNU find , this works to make the paths relative (removes parents with sed ):

find /my/dir/ -type f -o -type l -o -type d | sed s,^/my/dir/,, | tar -czf mydir.tgz --no-recursion -C /my/dir/ -T -

BrainStone ,Dec 21, 2016 at 22:14

This Answer should work in most situations. Notice however how the filenames are stored in the tar file as, for example, ./file1 rather than just file1 . I found that this caused problems when using this method to manipulate tarballs used as package files in BuildRoot .

One solution is to use some Bash globs to list all files except for .. like this:

tar -C my_dir -zcvf my_dir.tar.gz .[^.]* ..?* *

This is a trick I learnt from this answer .

Now tar will return an error if there are no files matching ..?* or .[^.]* , but it will still work. If the error is a problem (you are checking for success in a script), this works:

shopt -s nullglob
tar -C my_dir -zcvf my_dir.tar.gz .[^.]* ..?* *
shopt -u nullglob

Though now we are messing with shell options, we might decide that it is neater to have * match hidden files:

shopt -s dotglob
tar -C my_dir -zcvf my_dir.tar.gz *
shopt -u dotglob

This might not work where your shell globs * in the current directory, so alternatively, use:

shopt -s dotglob
cd my_dir
tar -zcvf ../my_dir.tar.gz *
cd ..
shopt -u dotglob

PanCrit ,Jun 14, 2010 at 6:47

cd my_directory
tar zcvf ../my_directory.tar.gz *

anion ,May 11, 2018 at 14:10

If it's a Unix/Linux system, and you care about hidden files (which will be missed by *), you need to do:
cd my_directory
tar zcvf ../my_directory.tar.gz * .??*

I don't know what hidden files look like under Windows.

gpz500 ,Feb 27, 2014 at 10:46

I would propose the following Bash function (first argument is the path to the dir, second argument is the basename of resulting archive):
function tar_dir_contents ()
{
    local DIRPATH="$1"
    local TARARCH="$2.tar.gz"
    local ORGIFS="$IFS"
    IFS=$'\n'
    tar -C "$DIRPATH" -czf "$TARARCH" $( ls -a "$DIRPATH" | grep -v '\(^\.$\)\|\(^\.\.$\)' )
    IFS="$ORGIFS"
}

You can run it in this way:

$ tar_dir_contents /path/to/some/dir my_archive

and it will generate the archive my_archive.tar.gz within current directory. It works with hidden (.*) elements and with elements with spaces in their filename.

med ,Feb 9, 2017 at 17:19

cd my_directory && tar -czvf ../my_directory.tar.gz $(ls -A) && cd ..

This one worked for me, and it includes all hidden files without putting all files in a root directory named "." like in tomoe's answer:

Breno Salgado ,Apr 16, 2016 at 15:42

Use pax.

Pax is a deprecated package but does the job perfectly and in a simple fashion.

pax -w > mydir.tar mydir

asynts ,Jun 26 at 16:40

Simplest way I found:

cd my_dir && tar -czvf ../my_dir.tar.gz *

marcingo ,Aug 23, 2016 at 18:04

# tar all files within and deeper in a given directory
# with no prefixes ( neither <directory>/ nor ./ )
# parameters: <source directory> <target archive file>
function tar_all_in_dir {
    { cd "$1" && find -type f -print0; } \
    | cut --zero-terminated --characters=3- \
    | tar --create --file="$2" --directory="$1" --null --files-from=-
}

Safely handles filenames with spaces or other unusual characters. You can optionally add a -name '*.sql' or similar filter to the find command to limit the files included.

user1456599 ,Feb 13, 2013 at 21:37

 tar -cvzf  tarlearn.tar.gz --remove-files mytemp/*

If the folder is mytemp, then applying the above will archive and remove all the files in the folder, but leave the folder itself alone.

 tar -cvzf  tarlearn.tar.gz --remove-files --exclude='*12_2008*' --no-recursion mytemp/*

You can give exclude patterns and also specify that subfolders should not be descended into.

Aaron Digulla ,Jun 2, 2009 at 15:33

tar -C my_dir -zcvf my_dir.tar.gz `ls my_dir`

[Jun 26, 2019] 7,000 Developers Report Their Top Languages: Java, JavaScript, and Python

The article mixes apples and oranges and demonstrates complete ignorance of language classification.
Two of the three top languages are scripting languages. This is a huge victory. But Python has problems with efficiency (not that they matter everywhere) and is far from being an elegant language. It entered the mainstream via its adoption at universities as the first programming language, displacing Java (which I think might be a mistake -- I think teaching should start with assembler and replicate the history of development: assembler -- compiled languages -- scripting languages).
Perl, which essentially heralded the era of scripting languages, is now losing its audience and shrinking back to its initial purpose -- a tool for Unix system administrators. But I think in such surveys its use is underreported for obvious reasons -- it is not fashionable. Note, too, that Fortran is still widely used.
Go is just a variant of a "better C" -- a statically typed, compiled language. Rust is an attempt to improve C++. Both belong to the class of compiled languages. So compiled languages still hold their own and are an important part of the ecosystem. See also How Rust Compares to Other Programming Languages - The New Stack
Jun 26, 2019 | developers.slashdot.org
The report surveyed about 7,000 developers worldwide, and revealed Python is the most studied programming language, the most loved language , and the third top primary programming language developers are using... The top use cases developers are using Python for include data analysis, web development, machine learning and writing automation scripts, according to the JetBrains report . More developers are also beginning to move over to Python 3, with 9 out of 10 developers using the current version.

The JetBrains report also found while Go is still a young language, it is the most promising programming language. "Go started out with a share of 8% in 2017 and now it has reached 18%. In addition, the biggest number of developers (13%) chose Go as a language they would like to adopt or migrate to," the report stated...

Seventy-three percent of JavaScript developers use TypeScript, which is up from 17 percent last year. Seventy-one percent of Kotlin developers use Kotlin for work. Java 8 is still the most popular programming language, but developers are beginning to migrate to Java 10 and 11.
JetBrains (which designed Kotlin in 2011) also said that 60% of their survey's respondents identified themselves as professional web back-end developers (while 46% said they did web front-end, and 23% developed mobile applications). 41% said they hadn't contributed to open source projects "but I would like to," while 21% said they contributed "several times a year."

"16% of developers don't have any tests in their projects. Among fully-employed senior developers though, that statistic is just 8%. Like last year, about 30% of developers still don't have unit tests in their projects." Other interesting statistics: 52% say they code in their dreams. 57% expect AI to replace developers "partially" in the future. "83% prefer the Dark theme for their editor or IDE. This represents a growth of 6 percentage points since last year for each environment. 47% take public transit to work.

And 97% of respondents using Rust "said they have been using Rust for less than a year. With only 14% using it for work, it's much more popular as a language for personal/side projects." And more than 90% of the Rust developers who responded worked with codebases with less than 300 files.

[Jun 23, 2019] Utilizing multi core for tar+gzip-bzip compression-decompression

Highly recommended!
Notable quotes:
"... There is effectively no CPU time spent tarring, so it wouldn't help much. The tar format is just a copy of the input file with header blocks in between files. ..."
"... You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use. ..."
Jun 23, 2019 | stackoverflow.com

user1118764 , Sep 7, 2012 at 6:58

I normally compress using tar zcvf and decompress using tar zxvf (using gzip due to habit).

I've recently gotten a quad core CPU with hyperthreading, so I have 8 logical cores, and I notice that many of the cores are unused during compression/decompression.

Is there any way I can utilize the unused cores to make it faster?

Warren Severin , Nov 13, 2017 at 4:37

The solution proposed by Xiong Chiamiov above works beautifully. I had just backed up my laptop with .tar.bz2 and it took 132 minutes using only one CPU thread. Then I compiled and installed tar from source ( gnu.org/software/tar ), including the options mentioned in the configure step: ./configure --with-gzip=pigz --with-bzip2=lbzip2 --with-lzip=plzip . I ran the backup again and it took only 32 minutes. That's better than a 4X improvement! I watched the system monitor and it kept all 4 CPUs (8 threads) flatlined at 100% the whole time. THAT is the best solution. – Warren Severin Nov 13 '17 at 4:37

Mark Adler , Sep 7, 2012 at 14:48

You can use pigz instead of gzip, which does gzip compression on multiple cores. Instead of using the -z option, you would pipe it through pigz:
tar cf - paths-to-archive | pigz > archive.tar.gz

By default, pigz uses the number of available cores, or eight if it could not query that. You can ask for more with -p n, e.g. -p 32. pigz has the same options as gzip, so you can request better compression with -9. E.g.

tar cf - paths-to-archive | pigz -9 -p 32 > archive.tar.gz

user788171 , Feb 20, 2013 at 12:43

How do you use pigz to decompress in the same fashion? Or does it only work for compression?

Mark Adler , Feb 20, 2013 at 16:18

pigz does use multiple cores for decompression, but only with limited improvement over a single core. The deflate format does not lend itself to parallel decompression.

The decompression portion must be done serially. The other cores for pigz decompression are used for reading, writing, and calculating the CRC. When compressing on the other hand, pigz gets close to a factor of n improvement with n cores.
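
For completeness, a sketch of the decompression direction (the archive name is hypothetical); pigz accepts the same options as gzip, so -d and -c behave as expected:

# decompress with pigz and hand the plain tar stream to tar on stdin
pigz -dc archive.tar.gz | tar xf -
# the output of pigz is ordinary gzip, so plain 'tar xzf archive.tar.gz' works too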

Garrett , Mar 1, 2014 at 7:26

The hyphen here is stdout (see this page ).

Mark Adler , Jul 2, 2014 at 21:29

Yes. 100% compatible in both directions.

Mark Adler , Apr 23, 2015 at 5:23

There is effectively no CPU time spent tarring, so it wouldn't help much. The tar format is just a copy of the input file with header blocks in between files.

Jen , Jun 14, 2013 at 14:34

You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use.

For example use:

tar -c --use-compress-program=pigz -f tar.file dir_to_zip

Valerio Schiavoni , Aug 5, 2014 at 22:38

Unfortunately by doing so the concurrent feature of pigz is lost. You can see for yourself by executing that command and monitoring the load on each of the cores. – Valerio Schiavoni Aug 5 '14 at 22:38

bovender , Sep 18, 2015 at 10:14

@ValerioSchiavoni: Not here, I get full load on all 4 cores (Ubuntu 15.04 'Vivid'). – bovender Sep 18 '15 at 10:14

Valerio Schiavoni , Sep 28, 2015 at 23:41

On compress or on decompress ? – Valerio Schiavoni Sep 28 '15 at 23:41

Offenso , Jan 11, 2017 at 17:26

I prefer tar cf - dir_to_zip | pv | pigz > tar.file . pv helps me estimate progress; you can skip it. But still, it's easier to write and remember. – Offenso Jan 11 '17 at 17:26

Maxim Suslov , Dec 18, 2014 at 7:31

Common approach

There is an option for the tar program:

-I, --use-compress-program PROG
      filter through PROG (must accept -d)

You can use multithread version of archiver or compressor utility.

The most popular multithreaded compressors are pigz (instead of gzip) and pbzip2 (instead of bzip2). For instance:

$ tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 paths_to_archive
$ tar --use-compress-program=pigz -cf OUTPUT_FILE.tar.gz paths_to_archive

The compressor must accept -d. If your replacement utility doesn't have this parameter and/or you need to specify additional parameters, then use pipes (add parameters if necessary):

$ tar cf - paths_to_archive | pbzip2 > OUTPUT_FILE.tar.bz2
$ tar cf - paths_to_archive | pigz > OUTPUT_FILE.tar.gz

The input and output of the single-threaded and multithreaded versions are compatible. You can compress using the multithreaded version and decompress using the single-threaded version, and vice versa.

p7zip

For p7zip for compression you need a small shell script like the following:

#!/bin/sh
case $1 in
  -d) 7za -txz -si -so e;;
   *) 7za -txz -si -so a .;;
esac 2>/dev/null

Save it as 7zhelper.sh. Here the example of usage:

$ tar -I 7zhelper.sh -cf OUTPUT_FILE.tar.7z paths_to_archive
$ tar -I 7zhelper.sh -xf OUTPUT_FILE.tar.7z
xz

Regarding multithreaded XZ support: if you are running version 5.2.0 or above of XZ Utils, you can utilize multiple cores for compression by setting -T or --threads to an appropriate value via the environment variable XZ_DEFAULTS (e.g. XZ_DEFAULTS="-T 0" ).

This is a fragment of man for 5.1.0alpha version:

Multithreaded compression and decompression are not implemented yet, so this option has no effect for now.

However, this will not work for decompression of files that haven't also been compressed with threading enabled. From the man page for version 5.2.2:

Threaded decompression hasn't been implemented yet. It will only work on files that contain multiple blocks with size information in block headers. All files compressed in multi-threaded mode meet this condition, but files compressed in single-threaded mode don't even if --block-size=size is used.
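
A sketch of the environment-variable form described above (assumes XZ Utils 5.2.0 or later; -T 0 lets xz start one worker per core):

# -J selects xz; the threading setting is picked up from XZ_DEFAULTS
XZ_DEFAULTS="-T 0" tar -cJf archive.tar.xz paths_to_archive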

Recompiling with replacement

If you build tar from sources, then you can recompile with parameters

--with-gzip=pigz
--with-bzip2=lbzip2
--with-lzip=plzip

After recompiling tar with these options you can check the output of tar's help:

$ tar --help | grep "lbzip2\|plzip\|pigz"
  -j, --bzip2                filter the archive through lbzip2
      --lzip                 filter the archive through plzip
  -z, --gzip, --gunzip, --ungzip   filter the archive through pigz

mpibzip2 , Apr 28, 2015 at 20:57

I just found pbzip2 and mpibzip2 . mpibzip2 looks very promising for clusters or if you have a laptop and a multicore desktop computer for instance. – user1985657 Apr 28 '15 at 20:57

oᴉɹǝɥɔ , Jun 10, 2015 at 17:39

Processing STDIN may in fact be slower. – oᴉɹǝɥɔ Jun 10 '15 at 17:39

selurvedu , May 26, 2016 at 22:13

Plus 1 for xz option. It the simplest, yet effective approach. – selurvedu May 26 '16 at 22:13

panticz.de , Sep 1, 2014 at 15:02

You can use the shortcut -I for tar's --use-compress-program switch, and invoke pbzip2 for bzip2 compression on multiple cores:
tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 DIRECTORY_TO_COMPRESS/

einpoklum , Feb 11, 2017 at 15:59

A nice TL;DR for @MaximSuslov's answer . – einpoklum Feb 11 '17 at 15:59

If you want to have more flexibility with filenames and compression options, you can use:
find /my/path/ -type f -name "*.sql" -o -name "*.log" -exec \
tar -P --transform='s@/my/path/@@g' -cf - {} + | \
pigz -9 -p 4 > myarchive.tar.gz
Step 1: find

find /my/path/ -type f -name "*.sql" -o -name "*.log" -exec

This command will look for the files you want to archive, in this case /my/path/*.sql and /my/path/*.log . Add as many -o -name "pattern" as you want.

-exec will execute the next command using the results of find : tar

Step 2: tar

tar -P --transform='s@/my/path/@@g' -cf - {} +

--transform is a simple string replacement parameter. It will strip the path of the files from the archive so the tarball's root becomes the current directory when extracting. Note that you can't use the -C option to change directory, as you would lose the benefits of find : all files of the directory would be included.

-P tells tar to use absolute paths, so it doesn't trigger the warning "Removing leading `/' from member names". The leading '/' will be removed by --transform anyway.

-cf - tells tar to write the archive to standard output; the tarball name is supplied later via the shell redirection

{} + passes every file that find found previously

Step 3: pigz

pigz -9 -p 4

Use as many parameters as you want. In this case -9 is the compression level and -p 4 is the number of cores dedicated to compression. If you run this on a heavily loaded webserver, you probably don't want to use all available cores.

Step 4: archive name

> myarchive.tar.gz

Finally.

[May 24, 2019] How to send keystrokes from one computer to another by USB?

Notable quotes:
"... On a different note, have you considered a purely software/network solution such as TightVNC ? ..."
Aug 05, 2018 | stackoverflow.com



Yehonatan ,Aug 5, 2018 at 6:34

Is there a way to use one computer to send keystrokes to another by USB?

What I'm looking to do is capture the USB signal used by a keyboard (with USBTrace, for example) and use it with PC-1 to send it to PC-2, so that PC-2 recognizes it as regular keyboard input.

Any leads on how to do this would be much appreciated.

Lucas ,Jan 16, 2011 at 19:18

What you essentially need is a USB port on PC-1 that will act as a USB device for PC-2.

That is not possible for the vast majority of PC systems because USB is an asymmetric bus, with a host/device (or master/slave, if you wish) architecture. USB controllers (and their ports) on most PCs can only work in host mode and cannot simulate a device.

That is the reason that you cannot network computers through USB without a special cable with specialized electronics.

The only exception is if you somehow have a PC that supports the USB On-The-Go standard that allows for a USB port to act in both host and device mode. USB-OTG devices do exist, but they are usually embedded devices (smartphones etc). I don't know if there is a way to add a USB-OTG port to a commodity PC.

EDIT:

If you do not need a keyboard before the OS on PC-2 boots, you might be able to use a pair of USB Bluetooth dongles - one on each PC. You'd have to use specialised software on PC-1, but it is definitely possible - I've already seen a possible implementation on Linux , and I am reasonably certain that there must be one for Windows. You will also need Bluetooth HID drivers on PC-2, if they are not already installed.

On a different note, have you considered a purely software/network solution such as TightVNC ?

bebbo ,Sep 20, 2017 at 18:14

There is a solution: https://github.com/Flowm/etherkey

This uses a network connection from your computer to the raspi which is connected to a teensy (usb developer board) to send the key strokes.

This solution is not an out-of-the-box product. The required skill is similar to programming some other devices like arduino. But it's a complete and working setup.

Yehonatan ,Jan 25, 2011 at 5:51

The cheapest options are commercial microcontrollers (e.g. the Arduino platform, PIC, etc.) or ready-built USB keyboard controllers (e.g. I-PAC, arcade controllers, etc.).

Benoit-Pierre DEMAINE ,Oct 27, 2017 at 17:17

SEARCH THIS PROGRAM:

TWedge: Keyboard Wedge Software (RS232, Serial, TCP, Bluetooth)

then, MAKE YOUR OWN CONNECTION CABLE WITH:

(usb <-> rs232) + (NULL MODEM) + (rs232 <-> usb)

Connect 2 computer, write your own program to send signal to your (usb <-> rs232) unit, then you can control another computer under the help of TWedge.

> ,

The above-mentioned https://github.com/Flowm/etherkey is one way. The keyboard is emulated from an rPi, but the principle can be used from PC to PC (or Mac to whatever). The core answer to your question is to use an OTG-capable chip, and then you control this chip via a USB-serial adapter.

https://euer.krebsco.de/a-software-kvm-switch.html uses a very similar method, using an Arduino instead of the Teensy.

The generic answer is: you need an OTG-capable or slave-capable device: Arduino, Teensy, Pi Zero (from either the Raspberry or Orange brands, both work; only the Zero models are OTG capable), or an rPi-A with heavy customisation (since it does not include a USB hub, it can theoretically be converted into a slave; I never found any public tutorial on doing it), or any smartphone (Samsung, Nokia, HTC, Oukitel ... most smartphones are OTG capable). If you go for a Pi or a phone, you want to dig into USB Gadget. Cheaper solutions (Arduino/Teensy) need custom firmware.

[May 03, 2019] RedHat local repository and offline updates

Aug 03, 2018 | stackoverflow.com

My company just bought two Red Hat licenses for two physical machines. The machines won't be accessible via the Internet, so we have an issue here regarding updates, patches, etc.

I am thinking of configuring a local repository that is accessible via the Internet and holds all the necessary updates, but there is a problem: I have only two licenses. Is it workable if I activate the local repository for the updates and one of my two service machines, or is there any other way -- for instance, some sort of offline package that I can download separately from Red Hat and use to update my machines without Internet access?

Thanks in advance.

XXX

You have several options:

See How can we regularly update a disconnected system (A system without internet connection)? for details.
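
One common approach is sketched below (a sketch only, with a hypothetical repo ID and paths; the exact repo IDs and steps depend on your RHEL release and subscription setup): mirror the needed channel on the Internet-connected, registered machine with reposync, turn the mirror into a repository with createrepo, and point the offline machine at the copy.

# on the machine that can reach Red Hat (registered and entitled)
reposync --repoid=rhel-6-server-rpms --download_path=/var/repo-mirror
createrepo /var/repo-mirror/rhel-6-server-rpms

# copy /var/repo-mirror to the offline machine (USB disk, DVD, ...), then define the repo there
cat > /etc/yum.repos.d/local-mirror.repo <<'EOF'
[local-mirror]
name=Local RHEL mirror
baseurl=file:///var/repo-mirror/rhel-6-server-rpms
enabled=1
gpgcheck=1
EOF
yum clean all && yum update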

[Mar 20, 2019] How do I troubleshoot a yum repository problem that has an error No package available error?

Mar 20, 2019 | unix.stackexchange.com

Kiran ,Jan 2, 2017 at 23:57

I have three RHEL 6.6 servers. One has a yum repository that I know works. The other two servers I will refer to as "yum clients." These two are configured to use the same yum repository (the first server described). When I do yum install httpd on each of these two yum client servers, I get two different results. One server prepares for the installation as normal and prompts me with a y/n prompt. The second server says

No package httpd available.

The /etc/yum.conf files on the two servers are identical. The /etc/yum.repos.d/ directories contain the same .repo files. Why does one yum client not see the httpd package? I use httpd only as an example: one yum client cannot install any package, while the other can install anything. Neither has Internet access, and neither can reach servers that the other cannot.

XXX

If /etc/yum.conf is identical on all servers and the package is not listed on an exclude line there, check whether the repo is enabled on all the servers.

Do

grep enabled /etc/yum.repos.d/filename.repo

and see if it is set to 0 or 1.

The value of enabled needs to be set to 1 for yum to use that repo.

If the repo is not enabled, you can edit the repo file and change enabled to 1, or run yum with the --enablerepo switch to enable it for that one operation.

Try running yum like this:

yum --enablerepo=repo_name install package_name

[Mar 20, 2019] How do I print to STDERR only if STDOUT is a different destination?

Mar 14, 2013 | stackoverflow.com

squiguy, Mar 14, 2013 at 19:06

I would like Perl to write to STDERR only if STDOUT is going somewhere different. For example, if both STDOUT and STDERR would send output to the terminal, then I don't want the STDERR message to be printed.

Consider the following example (outerr.pl):

#!/usr/bin/perl

use strict;
use warnings;

my $someMagicalFlag;   # placeholder: should be true only when STDERR goes somewhere other than STDOUT

print STDOUT "Hello standard output!\n";
print STDERR "Hello standard error\n" if ($someMagicalFlag);
exit 0;

Now consider this (this is what I would like to achieve):

bash $ outerr.pl
Hello standard output!

However, if I redirect out to a file, I'd like to get:

bash $ outerr.pl > /dev/null
Hello standard error

and similarly the other way round:

bash $ outerr.pl 2> /dev/null
Hello standard output!

If I re-direct both out/err to the same file, then only stdout should be displayed:

bash $ outerr.pl > foo.txt 2>&1
bash $ cat foo.txt
Hello standard output!

So is there a way to evaluate / determine whether OUT and ERR are pointing to the same "thing" (descriptor?)?

tchrist ,Mar 15, 2013 at 5:07

On Unix-style systems, you should be able to do:
my @stat_err = stat STDERR;
my @stat_out = stat STDOUT;

my $stderr_is_not_stdout = (($stat_err[0] != $stat_out[0]) ||
                            ($stat_err[1] != $stat_out[1]));

But that won't work on Windows, which doesn't have real inode numbers. It gives both false positives (thinks they're different when they aren't) and false negatives (thinks they're the same when they aren't).

Jim Stewart ,Mar 14, 2013 at 20:59

You can do that (almost) with -t:
-t STDERR

will be true if it is a terminal, and likewise for STDOUT.

This still would not tell you which terminal, and if you redirect to the same file, you may still get both.

Hence, if

-t STDERR && ! (-t STDOUT) || -t STDOUT && !(-t STDERR)

or shorter

-t STDOUT ^ -t STDERR  # thanks to @mob

you know you're okay.

EDIT: Solutions for the case that both STDERR and STDOUT are regular files:

Tom Christiansen suggested using stat and comparing the dev and ino fields. This will work on UNIX but, as @cjm pointed out, not on Windows.

If you can guarantee that no other program will write to the file, you could do the following on both Windows and UNIX (a sketch follows this list):

  1. Check the positions the file descriptors for STDOUT and STDERR are at; if they are not equal, you redirected one of them with >> to a non-empty file.
  2. Otherwise, write 42 bytes to file descriptor 2.
  3. Seek to the end of file descriptor 1. If its position is 42 more than before, chances are high that both are redirected to the same file. If it is unchanged, the files are different. If it changed, but not by 42, someone else is writing there and all bets are off (but then you're not on Windows, so the stat method will work).
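
A minimal Perl sketch of that heuristic (not from the thread; the subroutine name is hypothetical, and it assumes both handles point at regular files and that nothing else writes to the file while the probe runs; note that it leaves 42 probe bytes behind):

#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(SEEK_END);

# Probe whether STDOUT and STDERR are redirected to the same file,
# following the three steps above.
sub stdout_stderr_share_file {
    # Flush both handles so tell() reports the real file positions.
    select((select(STDOUT), $| = 1)[0]);
    select((select(STDERR), $| = 1)[0]);

    my $out_pos = tell(STDOUT);
    my $err_pos = tell(STDERR);
    return 0 if $out_pos < 0 || $err_pos < 0;   # not seekable (terminal, pipe, ...)
    return 0 if $out_pos != $err_pos;           # step 1: positions differ

    print STDERR 'x' x 42;                      # step 2: write 42 bytes to fd 2 (they stay in the file)

    seek(STDOUT, 0, SEEK_END);                  # step 3: jump to the end of fd 1
    return tell(STDOUT) == $out_pos + 42;       # grew by exactly 42 => very likely the same file
}

# Example use (only meaningful when both handles are regular files, as stated above):
print "STDOUT and STDERR appear to share a file\n" if stdout_stderr_share_file();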

[Mar 17, 2019] Translating Perl to Python

Mar 17, 2019 | stackoverflow.com

John Kugelman ,Jul 1, 2009 at 3:29

I found this Perl script while migrating my SQLite database to MySQL.

I was wondering (since I don't know Perl) how one could rewrite this in Python?

Bonus points for the shortest (code) answer :)

edit: sorry, I meant shortest code, not strictly shortest answer

#! /usr/bin/perl

while ($line = <>){
    if (($line !~  /BEGIN TRANSACTION/) && ($line !~ /COMMIT/) && ($line !~ /sqlite_sequence/) && ($line !~ /CREATE UNIQUE INDEX/)){

        if ($line =~ /CREATE TABLE \"([a-z_]*)\"(.*)/){
                $name = $1;
                $sub = $2;
                $sub =~ s/\"//g; #"
                $line = "DROP TABLE IF EXISTS $name;\nCREATE TABLE IF NOT EXISTS $name$sub\n";
        }
        elsif ($line =~ /INSERT INTO \"([a-z_]*)\"(.*)/){
                $line = "INSERT INTO $1$2\n";
                $line =~ s/\"/\\\"/g; #"
                $line =~ s/\"/\'/g; #"
        }else{
                $line =~ s/\'\'/\\\'/g; #'
        }
        $line =~ s/([^\\'])\'t\'(.)/$1THIS_IS_TRUE$2/g; #'
        $line =~ s/THIS_IS_TRUE/1/g;
        $line =~ s/([^\\'])\'f\'(.)/$1THIS_IS_FALSE$2/g; #'
        $line =~ s/THIS_IS_FALSE/0/g;
        $line =~ s/AUTOINCREMENT/AUTO_INCREMENT/g;
        print $line;
    }
}

Some additional code was necessary to successfully migrate the sqlite database (it handles one-line CREATE TABLE statements and foreign keys, and fixes a bug in the original program that converted empty fields '' to \').

I posted the code on the "migrating my SQLite database to mysql" question.

Jiaaro ,Jul 2, 2009 at 10:15

Here's a pretty literal translation with just the minimum of obvious style changes (putting all code into a function, using string rather than re operations where possible).
import re, fileinput

def main():
  for line in fileinput.input():
    process = False
    for nope in ('BEGIN TRANSACTION','COMMIT',
                 'sqlite_sequence','CREATE UNIQUE INDEX'):
      if nope in line: break
    else:
      process = True
    if not process: continue
    m = re.search('CREATE TABLE "([a-z_]*)"(.*)', line)
    if m:
      name, sub = m.groups()
      line = '''DROP TABLE IF EXISTS %(name)s;
CREATE TABLE IF NOT EXISTS %(name)s%(sub)s
'''
      line = line % dict(name=name, sub=sub)
    else:
      m = re.search('INSERT INTO "([a-z_]*)"(.*)', line)
      if m:
        line = 'INSERT INTO %s%s\n' % m.groups()
        line = line.replace('"', r'\"')
        line = line.replace('"', "'")
    line = re.sub(r"([^'])'t'(.)", r"\1THIS_IS_TRUE\2", line)
    line = line.replace('THIS_IS_TRUE', '1')
    line = re.sub(r"([^'])'f'(.)", r"\1THIS_IS_FALSE\2", line)
    line = line.replace('THIS_IS_FALSE', '0')
    line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')
    print line,

main()

dr jimbob ,May 20, 2018 at 0:54

Alex Martelli's solution above works well, but needs some fixes and additions:

In the lines using regular expression substitution, the insertion of the matched groups must be double-escaped, OR the replacement string must be prefixed with r to mark it as a raw string:

line = re.sub(r"([^'])'t'(.)", "\\1THIS_IS_TRUE\\2", line)

or

line = re.sub(r"([^'])'f'(.)", r"\1THIS_IS_FALSE\2", line)

Also, this line should be added before print:

line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')

Last, the column names in CREATE statements should be backtick-quoted for MySQL. Add this at line 15:

  sub = sub.replace('"','`')

Here's the complete script with modifications:

import re, fileinput

def main():
  for line in fileinput.input():
    process = False
    for nope in ('BEGIN TRANSACTION','COMMIT',
                 'sqlite_sequence','CREATE UNIQUE INDEX'):
      if nope in line: break
    else:
      process = True
    if not process: continue
    m = re.search('CREATE TABLE "([a-z_]*)"(.*)', line)
    if m:
      name, sub = m.groups()
      sub = sub.replace('"','`')
      line = '''DROP TABLE IF EXISTS %(name)s;
CREATE TABLE IF NOT EXISTS %(name)s%(sub)s
'''
      line = line % dict(name=name, sub=sub)
    else:
      m = re.search('INSERT INTO "([a-z_]*)"(.*)', line)
      if m:
        line = 'INSERT INTO %s%s\n' % m.groups()
        line = line.replace('"', r'\"')
        line = line.replace('"', "'")
    line = re.sub(r"([^'])'t'(.)", "\\1THIS_IS_TRUE\\2", line)
    line = line.replace('THIS_IS_TRUE', '1')
    line = re.sub(r"([^'])'f'(.)", "\\1THIS_IS_FALSE\\2", line)
    line = line.replace('THIS_IS_FALSE', '0')
    line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')
    if re.search('^CREATE INDEX', line):
        line = line.replace('"','`')
    print line,

main()

Brad Gilbert ,Jul 1, 2009 at 18:43

Here is a slightly better version of the original.
#! /usr/bin/perl
use strict;
use warnings;
use 5.010; # for s/\K//;

while( <> ){
  next if m'
    BEGIN TRANSACTION   |
    COMMIT              |
    sqlite_sequence     |
    CREATE UNIQUE INDEX
  'x;

  if( my($name,$sub) = m'CREATE TABLE \"([a-z_]*)\"(.*)' ){
    # remove "
    $sub =~ s/\"//g; #"
    $_ = "DROP TABLE IF EXISTS $name;\nCREATE TABLE IF NOT EXISTS $name$sub\n";

  }elsif( /INSERT INTO \"([a-z_]*)\"(.*)/ ){
    $_ = "INSERT INTO $1$2\n";

    # " => \"
    s/\"/\\\"/g; #"
    # " => '
    s/\"/\'/g; #"

  }else{
    # '' => \'
    s/\'\'/\\\'/g; #'
  }

  # 't' => 1
  s/[^\\']\K\'t\'/1/g; #'

  # 'f' => 0
  s/[^\\']\K\'f\'/0/g; #'

  s/AUTOINCREMENT/AUTO_INCREMENT/g;
  print;
}

Mickey Mouse ,Jun 14, 2011 at 15:48

None of the scripts on this page can deal with this simple sqlite3 dump:
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE Filename (
  FilenameId INTEGER,
  Name TEXT DEFAULT '',
  PRIMARY KEY(FilenameId) 
  );
INSERT INTO "Filename" VALUES(1,'');
INSERT INTO "Filename" VALUES(2,'bigfile1');
INSERT INTO "Filename" VALUES(3,'%gconf-tree.xml');

None were able to reformat "table_name" into MySQL's proper `table_name`. Some messed up the empty string values.

Sinan Ünür ,Jul 1, 2009 at 3:24

I am not sure what is so hard to understand about this that it requires a snide remark as in your comment above. Note that <> is called the diamond operator. s/// is the substitution operator and // is the match operator m// .

Ken_g6 ,Jul 1, 2009 at 3:22

Based on http://docs.python.org/dev/howto/regex.html ...
  1. Replace $line =~ /.*/ with re.search(r".*", line) .
  2. $line !~ /.*/ is just !($line =~ /.*/) .
  3. Replace $line =~ s/.*/x/g with line=re.sub(r".*", "x", line) .
  4. Replace $1 through $9 inside re.sub with \1 through \9 respectively.
  5. Outside a sub, save the return value, i.e. m=re.search() , and replace $1 with the return value of m.group(1) .
  6. For "INSERT INTO $1$2\n" specifically, you can do "INSERT INTO %s%s\n" % (m.group(1), m.group(2)) .

hpavc ,Jul 1, 2009 at 12:33

The real issue is: do you actually know how to migrate the database? What is presented is merely a search-and-replace loop.


Shortest? The =~ operator applies a regex in Perl. "import re" and go from there. The only key differences are that you'll be using \1 and \2 instead of $1 and $2 when you assign values, and you'll be using %s when you're substituting regexp matches into strings.

[Mar 16, 2019] Regex translation from Perl to Python - Stack Overflow

Mar 16, 2019 | stackoverflow.com

royskatt ,Jan 30, 2014 at 14:45

I would like to rewrite a small Perl program in Python. I am processing text files with it as follows:

Input:

00000001;Root;;
00000002;  Documents;;
00000003;    oracle-advanced_plsql.zip;file;
00000004;  Public;;
00000005;  backup;;
00000006;    20110323-JM-F.7z.001;file;
00000007;    20110426-JM-F.7z.001;file;
00000008;    20110603-JM-F.7z.001;file;
00000009;    20110701-JM-F-via-summer_school;;
00000010;      20110701-JM-F-yyy.7z.001;file;

Desired output:

00000001;;Root;;
00000002;  ;Documents;;
00000003;    ;oracle-advanced_plsql.zip;file;
00000004;  ;Public;;
00000005;  ;backup;;
00000006;    ;20110323-JM-F.7z.001;file;
00000007;    ;20110426-JM-F.7z.001;file;
00000008;    ;20110603-JM-F.7z.001;file;
00000009;    ;20110701-JM-F-via-summer_school;;
00000010;      ;20110701-JM-F-yyy.7z.001;file;

Here is the working Perl code:

#!/usr/bin/perl -w
# filename: perl_regex.pl
while(<>) {                                                           
  s/^(.*?;.*?)(\w)/$1;$2/;                                            
  print $_;                                                           
}

I call it from the command line: perl_regex.pl input.txt

Explanation of the Perl-style regex:

s/        # start search-and-replace regexp
  ^       # start at the beginning of this line
  (       # save the matched characters until ')' in $1
    .*?;  # go forward until finding the first semicolon
    .*?   # go forward until finding... (to be continued below)
  )
  (       # save the matched characters until ')' in $2
    \w    # ... the next alphanumeric character.
  )
/         # continue with the replace part
  $1;$2   # write all characters found above, but insert a ; before $2
/         # finish the search-and-replace regexp.

Could anyone tell me how to get the same result in Python? Especially for the $1 and $2 variables I couldn't find anything similar.

royskatt ,Jan 31, 2014 at 6:18

Python regular expressions are very similar to Perl's, apart from a few differences in syntax.

Use re.sub to replace.

import re
import sys

for line in sys.stdin: # Explicitly iterate standard input line by line
    # `line` contains trailing newline!
    line = re.sub(r'^(.*?;.*?)(\w)', r'\1;\2', line)
    #print(line) # This print trailing newline
    sys.stdout.write(line) # Print the replaced string back.

royskatt ,Jan 31, 2014 at 16:36

The replacement instruction for s/pattern/replace/ in Python regexes is the re.sub(pattern, replace, string) function, or re.compile(pattern).sub(replace, string). In your case, you would do it like so:
_re_pattern = re.compile(r"^(.*?;.*?)(\w)")
result = _re_pattern.sub(r"\1;\2", line)

Note that $1 becomes \1. As in Perl, you need to iterate over your lines however you prefer (open the input file, split lines, ...).

[Mar 13, 2019] amp html - Convert img to amp-img - Stack Overflow

Mar 13, 2019 | stackoverflow.com


What is the standard way to convert an <img> into an <amp-img>?

Let me explain: on the site that I'm converting to AMP I have a lot of images without width and height, e.g.:

<img src="/img/image.png" alt="My image">

If I do not specify the layout, layout="container" is set by default and most of the images throw the following error:

amp-img error: Layout not supported for: container

On the other hand, most of the images don't fit the responsive layout, which is what Google recommends for most cases.

I have been checking the layout types in the documentation:

But none of them seems to fit an image that has to be shown at its real size, without specifying width or height.

So, in that case, what is the equivalent in AMP?


Since you say you have multiple images, it's better to use layout="responsive"; with that, at least your images will be responsive.

Now, regarding the width and height: they are a must.

If you read about the purpose of AMP, one of its goals is to make pages free of jumping/flickering content, which is exactly what happens if no width is specified for images.

By specifying the width, the (mobile) browser can calculate the precise space to reserve for that image and lay out the content around it. That way, there won't be any flickering of the content as the page and images load.

Regarding rewriting your HTML, one tip I can offer: you can write a small utility in PHP, Python or Node.js that reads each source image, calculates its dimensions, and rewrites your IMG tags.
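
The answer suggests PHP, Python or Node.js; purely as a hedged illustration, here is the same idea sketched in Perl. The script name and file layout are made up, it assumes the CPAN module Image::Size is installed, and it uses a naive regex that only handles <img> tags whose first attribute is src; a real converter should use an HTML parser.

#!/usr/bin/perl
# img2ampimg.pl (hypothetical): rewrite <img src="..."> tags into <amp-img>
# with width/height read from the image files themselves.
use strict;
use warnings;
use Image::Size qw(imgsize);     # CPAN module, must be installed

my ($docroot, $htmlfile) = @ARGV;
die "usage: $0 DOCROOT FILE.html\n" unless defined $htmlfile;

local $/;                        # slurp mode: read the whole file at once
open my $fh, '<', $htmlfile or die "$htmlfile: $!";
my $html = <$fh>;
close $fh;

$html =~ s{<img\s+src="([^"]+)"([^>]*)>}{
    my ($src, $rest) = ($1, $2);
    $rest =~ s{/\s*\z}{};                     # drop a self-closing slash, if any
    my ($w, $h) = imgsize("$docroot/$src");   # read the real image dimensions
    defined $w
      ? qq{<amp-img src="$src" width="$w" height="$h" layout="responsive"$rest></amp-img>}
      : qq{<img src="$src"$rest>}             # leave the tag alone if the file can't be read
}ge;

print $html;                     # emit the rewritten document to STDOUT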

Hope this helps, and good luck with your AMP-powered site :-)

[Mar 10, 2019] How do I detach a process from Terminal, entirely?

Mar 10, 2019 | superuser.com

stackoverflow.com, Aug 25, 2016 at 17:24

I use Tilda (drop-down terminal) on Ubuntu as my "command central" - pretty much the way others might use GNOME Do, Quicksilver or Launchy.

However, I'm struggling with how to completely detach a process (e.g. Firefox) from the terminal it's been launched from - i.e. prevent such a (non-)child process from polluting the terminal with its output and from being killed when the terminal is closed.

For example, in order to start Vim in a "proper" terminal window, I have tried a simple script like the following:

exec gnome-terminal -e "vim $@" &> /dev/null &

However, that still causes pollution (also, passing a file name doesn't seem to work).

lhunath, Sep 23, 2016 at 19:08

First of all; once you've started a process, you can background it by first stopping it (hit Ctrl - Z ) and then typing bg to let it resume in the background. It's now a "job", and its stdout / stderr / stdin are still connected to your terminal.

You can start a process as backgrounded immediately by appending a "&" to the end of it:

firefox &

To run it in the background silenced, use this:

firefox </dev/null &>/dev/null &

Some additional info:

nohup is a program you can use to run your application with such that its stdout/stderr can be sent to a file instead and such that closing the parent script won't SIGHUP the child. However, you need to have had the foresight to have used it before you started the application. Because of the way nohup works, you can't just apply it to a running process .

disown is a bash builtin that removes a shell job from the shell's job list. What this basically means is that you can't use fg , bg on it anymore, but more importantly, when you close your shell it won't hang or send a SIGHUP to that child anymore. Unlike nohup , disown is used after the process has been launched and backgrounded.

What you can't do, is change the stdout/stderr/stdin of a process after having launched it. At least not from the shell. If you launch your process and tell it that its stdout is your terminal (which is what you do by default), then that process is configured to output to your terminal. Your shell has no business with the processes' FD setup, that's purely something the process itself manages. The process itself can decide whether to close its stdout/stderr/stdin or not, but you can't use your shell to force it to do so.

To manage a background process' output, you have plenty of options from scripts, "nohup" probably being the first to come to mind. But for interactive processes you start but forgot to silence ( firefox < /dev/null &>/dev/null & ) you can't do much, really.

I recommend you get GNU screen . With screen you can just close your running shell when the process' output becomes a bother and open a new one ( ^Ac ).


Oh, and by the way, don't use " $@ " where you're using it.

$@ means, $1 , $2 , $3 ..., which would turn your command into:

gnome-terminal -e "vim $1" "$2" "$3" ...

That's probably not what you want because -e only takes one argument. Use $1 to show that your script can only handle one argument.

It's really difficult to get multiple arguments working properly in the scenario that you gave (with the gnome-terminal -e ) because -e takes only one argument, which is a shell command string. You'd have to encode your arguments into one. The best and most robust, but rather cludgy, way is like so:

gnome-terminal -e "vim $(printf "%q " "$@")"

Limited Atonement ,Aug 25, 2016 at 17:22

nohup cmd &

nohup detaches the process completely (daemonizes it)

Randy Proctor ,Sep 13, 2016 at 23:00

If you are using bash , try disown [ jobspec ] ; see bash(1) .

Another approach you can try is at now . If you're not superuser, your permission to use at may be restricted.

Stephen Rosen ,Jan 22, 2014 at 17:08

Reading these answers, I was under the initial impression that issuing nohup <command> & would be sufficient. Running zsh in gnome-terminal, I found that nohup <command> & did not prevent my shell from killing child processes on exit. Although nohup is useful, especially with non-interactive shells, it only guarantees this behavior if the child process does not reset its handler for the SIGHUP signal.

In my case, nohup should have prevented hangup signals from reaching the application, but the child application (VMWare Player in this case) was resetting its SIGHUP handler. As a result when the terminal emulator exits, it could still kill your subprocesses. This can only be resolved, to my knowledge, by ensuring that the process is removed from the shell's jobs table. If nohup is overridden with a shell builtin, as is sometimes the case, this may be sufficient, however, in the event that it is not...


disown is a shell builtin in bash , zsh , and ksh93 ,

<command> &
disown

or

<command> & disown

if you prefer one-liners. This has the generally desirable effect of removing the subprocess from the jobs table. This allows you to exit the terminal emulator without accidentally signaling the child process at all. No matter what the SIGHUP handler looks like, this should not kill your child process.

After the disown, the process is still a child of your terminal emulator (play with pstree if you want to watch this in action), but after the terminal emulator exits, you should see it attached to the init process. In other words, everything is as it should be, and as you presumably want it to be.

What to do if your shell does not support disown ? I'd strongly advocate switching to one that does, but in the absence of that option, you have a few choices.

  1. screen and tmux can solve this problem, but they are much heavier weight solutions, and I dislike having to run them for such a simple task. They are much more suitable for situations in which you want to maintain a tty, typically on a remote machine.
  2. For many users, it may be desirable to see if your shell supports a capability like zsh's setopt nohup . This can be used to specify that SIGHUP should not be sent to the jobs in the jobs table when the shell exits. You can either apply this just before exiting the shell, or add it to shell configuration like ~/.zshrc if you always want it on.
  3. Find a way to edit the jobs table. I couldn't find a way to do this in tcsh or csh , which is somewhat disturbing.
  4. Write a small C program to fork off and exec() . This is a very poor solution, but the source should only consist of a couple dozen lines. You can then pass commands as commandline arguments to the C program, and thus avoid a process specific entry in the jobs table.

Sheljohn ,Jan 10 at 10:20

  1. nohup $COMMAND &
  2. $COMMAND & disown
  3. setsid command

I've been using number 2 for a very long time, but number 3 works just as well. Also, disown has a 'nohup' flag of '-h', can disown all processes with '-a', and can disown all running processes with '-ar'.

Silencing is accomplished by '$COMMAND &>/dev/null'.

Hope this helps!

dunkyp ,Mar 25, 2009 at 1:51

I think screen might solve your problem

Nathan Fellman ,Mar 23, 2009 at 14:55

in tcsh (and maybe in other shells as well), you can use parentheses to detach the process.

Compare this:

> jobs # shows nothing
> firefox &
> jobs
[1]  + Running                       firefox

To this:

> jobs # shows nothing
> (firefox &)
> jobs # still shows nothing
>

This removes firefox from the jobs listing, but it is still tied to the terminal; if you logged in to this node via 'ssh', trying to log out will still hang the ssh process.

,

To disassociate from the controlling tty, run the command through a sub-shell, e.g.:

(command)&

When exit is used, the terminal closes but the process is still alive.

Check:

(sleep 100) & exit

Open another terminal:

ps aux | grep sleep

The process is still alive.

[Mar 10, 2019] How to run tmux/screen with systemd 230 ?

Aug 02, 2018 | askubuntu.com

MvanGeest ,May 10, 2017 at 20:59

I run 16.04 and systemd now kills tmux when the user disconnects ( summary of the change ).

Is there a way to run tmux or screen (or any similar program) with systemd 230? I read all the heated discussion about the pros and cons of the behaviour, but no solution was suggested.

(I see the behaviour in 229 as well)

WoJ ,Aug 2, 2016 at 20:30

RemainAfterExit=

Takes a boolean value that specifies whether the service shall be considered active even when all its processes exited. Defaults to no.

jpath ,Feb 13 at 12:29

The proper solution is to disable the offending systemd behavior system-wide.

Edit /etc/systemd/logind.conf ( you must sudo , of course) and set

KillUserProcesses=no

You can also put this setting in a separate file, e.g. /etc/systemd/logind.conf.d/99-dont-kill-user-processes.conf .

Then restart systemd-logind.service .

sudo systemctl restart systemd-logind

sarnold ,Dec 9, 2016 at 11:59

Based on @Rinzwind's answer and inspired by a unit description the best I could find is to use TaaS (Tmux as a Service) - a generic detached instance of tmux one reattaches to.
# cat /etc/systemd/system/tmux@.service

[Unit]
Description=tmux default session (detached)
Documentation=man:tmux(1)

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/tmux new-session -d -s %I
ExecStop=/usr/bin/tmux kill-server
KillMode=none

[Install]
WantedBy=multiplexer.target

# systemctl start tmux@instanceone.service
# systemctl start tmux@instancetwo.service
# tmux list-sessions

instanceone: 1 windows (created Sun Jul 24 00:52:15 2016) [193x49]
instancetwo: 1 windows (created Sun Jul 24 00:52:19 2016) [193x49]

# tmux attach-session -t instanceone

(instanceone)#

Robin Hartmann ,Aug 2, 2018 at 20:23

You need to set the Type of the service to forking , as explained here .

Let's assume the service you want to run in screen is called minecraft . Then you would open minecraft.service in a text editor and add or edit the entry Type=forking under the section [Service] .


According to https://unix.stackexchange.com/a/287282/117599 invoking tmux using
systemd-run --user --scope tmux

should also do the trick.

[Mar 10, 2019] linux - How to attach terminal to detached process

Mar 10, 2019 | unix.stackexchange.com

Gilles ,Feb 16, 2012 at 21:39

I have detached a process from my terminal, like this:
$ process &

That terminal is now long closed, but the process is still running and I want to send some commands to that process's stdin. Is that possible?

Samuel Edwin Ward ,Dec 22, 2018 at 13:34

Yes, it is. First, create a pipe: mkfifo /tmp/fifo . Use gdb to attach to the process: gdb -p PID

Then close stdin: call close (0) ; and open it again: call open ("/tmp/fifo", 0600)

Finally, write away (from a different terminal, as gdb will probably hang):

echo blah > /tmp/fifo

NiKiZe ,Jan 6, 2017 at 22:52

When original terminal is no longer accessible...

reptyr might be what you want, see https://serverfault.com/a/284795/187998

Quote from there:

Have a look at reptyr , which does exactly that. The github page has all the information.
reptyr - A tool for "re-ptying" programs.

reptyr is a utility for taking an existing running program and attaching it to a new terminal. Started a long-running process over ssh, but have to leave and don't want to interrupt it? Just start a screen, use reptyr to grab it, and then kill the ssh session and head on home.

USAGE

reptyr PID

"reptyr PID" will grab the process with id PID and attach it to your current terminal.

After attaching, the process will take input from and write output to the new terminal, including ^C and ^Z. (Unfortunately, if you background it, you will still have to run "bg" or "fg" in the old terminal. This is likely impossible to fix in a reasonable way without patching your shell.)

manatwork ,Nov 20, 2014 at 22:59

I am quite sure you can not.

Check using ps x . If a process has a ? as controlling tty , you can not send input to it any more.

9942 ?        S      0:00 tail -F /var/log/messages
9947 pts/1    S      0:00 tail -F /var/log/messages

In this example, you can send input to 9947 doing something like echo "test" > /dev/pts/1 . The other process ( 9942 ) is not reachable.

Next time, you could use screen or tmux to avoid this situation.

Stéphane Gimenez ,Feb 16, 2012 at 16:16

EDIT : As Stephane Gimenez said, it's not that simple. It's only allowing you to print to a different terminal.

You can try to write to this process using /proc. It should be located in /proc/PID/fd/0, so a simple:

echo "hello" > /proc/PID/fd/0

should do it. I have not tried it, but it should work, as long as this process still has a valid stdin file descriptor. You can check it with ls -l on /proc/PID/fd/.

See nohup for more details about how to keep processes running.

Stéphane Gimenez ,Nov 20, 2015 at 5:08

Just ending the command line with & will not completely detach the process, it will just run it in the background. (With zsh you can use &! to actually detach it; otherwise you have to disown it later).

When a process runs in the background, it won't receive input from its controlling terminal anymore. But you can send it back into the foreground with fg and then it will read input again.

Otherwise, it's not possible to externally change its filedescriptors (including stdin) or to reattach a lost controlling terminal unless you use debugging tools (see Ansgar's answer , or have a look at the retty command).

[Mar 10, 2019] linux - Preventing tmux session created by systemd from automatically terminating on Ctrl+C - Stack Overflow

Mar 10, 2019 | stackoverflow.com

Jim Stewart ,Nov 10, 2018 at 12:55

For a few days now I've been successfully running the new Minecraft Bedrock Edition dedicated server on my Ubuntu 18.04 LTS home server. Because it should be available 24/7 and automatically start up after boot, I created a systemd service for a detached tmux session:

tmux.minecraftserver.service

[Unit]
Description=tmux minecraft_server detached

[Service]
Type=forking
WorkingDirectory=/home/mine/minecraftserver
ExecStart=/usr/bin/tmux new -s minecraftserver -d "LD_LIBRARY_PATH=. /home/mine/minecraftser$
User=mine

[Install]
WantedBy=multi-user.target

Everything works as expected but there's one tiny thing that keeps bugging me:

How can I prevent tmux from terminating its whole session when I press Ctrl+C ? I just want to terminate the Minecraft server process itself instead of the whole tmux session. When starting the server from the command line in a manually created tmux session this does work (the session stays alive), but not when the session was brought up by systemd .

FlKo ,Nov 12, 2018 at 6:21

When starting the server from the command line in a manually created tmux session this does work (session stays alive) but not when the session was brought up by systemd .

The difference between these situations is actually unrelated to systemd. In one case, you're starting the server from a shell within the tmux session, and when the server terminates, control returns to the shell. In the other case, you're starting the server directly within the tmux session, and when it terminates there's no shell to return to, so the tmux session also dies.

tmux has an option to keep the session alive after the process inside it dies (look for remain-on-exit in the manpage), but that's probably not what you want: you want to be able to return to an interactive shell, to restart the server, investigate why it died, or perform maintenance tasks, for example. So it's probably better to change your command to this:

'LD_LIBRARY_PATH=. /home/mine/minecraftserver/ ; exec bash'

That is, first run the server, and then, after it terminates, replace the process (the shell which tmux implicitly spawns to run the command, but which will then exit) with another, interactive shell. (For some other ways to get an interactive shell after the command exits, see e. g. this question – but note that the <(echo commands) syntax suggested in the top answer is not available in systemd unit files.)

FlKo ,Nov 12, 2018 at 6:21

I was able to solve this by using systemd's ExecStartPost and tmux's send-keys like this:
[Unit]
Description=tmux minecraft_server detached

[Service]
Type=forking
WorkingDirectory=/home/mine/minecraftserver
ExecStart=/usr/bin/tmux new -d -s minecraftserver
ExecStartPost=/usr/bin/tmux send-keys -t minecraftserver "cd /home/mine/minecraftserver/" Enter "LD_LIBRARY_PATH=. ./bedrock_server" Enter

User=mine

[Install]
WantedBy=multi-user.target

[Mar 01, 2019] Creating symlinks instead of /bin /sbin /lib and /lib64 directories in RHEL7

That change essentially means that /usr should be on the root partition, not on a separate partition, which with the current sizes of hard drives is a reasonable requirement.
Notable quotes:
"... On Linux /bin and /usr/bin are still separate because it is common to have /usr on a separate partition (although this configuration breaks in subtle ways, sometimes). In /bin is all the commands that you will need if you only have / mounted. ..."
Mar 01, 2019 | unix.stackexchange.com

balki ,May 2, 2015 at 6:17

What? No, /bin/ is not a symlink to /usr/bin on any FHS-compliant system. Note that there are still popular Unixes and Linuxes that ignore this - for example, /bin and /sbin are symlinked to /usr/bin on Arch Linux (the reasoning being that you don't need /bin for rescue/single-user mode, since you'd just boot a live CD).

/bin

contains commands that may be used by both the system administrator and by users, but which are required when no other filesystems are mounted (e.g. in single user mode). It may also contain commands which are used indirectly by scripts

/usr/bin/

This is the primary directory of executable commands on the system.

essentially, /bin contains executables which are required by the system for emergency repairs, booting, and single user mode. /usr/bin contains any binaries that aren't required.

I will note that they can be on separate disks/partitions: /bin must be on the same disk as /, while /usr/bin can be on another disk - although note that this configuration has been kind of broken for a while (this is why, e.g., systemd warns about this configuration on boot).

For full correctness, some Unices may ignore the FHS, as I believe it is only a Linux standard; I'm not aware that it has yet been included in SUS, POSIX or any other UNIX standard, though it should be, IMHO. It is part of the LSB standard, though.

LawrenceC ,Jan 13, 2015 at 16:12

/sbin - Binaries needed for booting, low-level system repair, or maintenance (run level 1 or S)

/bin - Binaries needed for normal/standard system functioning at any run level.

/usr/bin - Application/distribution binaries meant to be accessed by locally logged in users

/usr/sbin - Application/distribution binaries that support or configure stuff in /sbin.

/usr/share/bin - Application/distribution binaries or scripts meant to be accessed via the web, i.e. Apache web applications

*local* - Binaries not part of a distribution; locally compiled or manually installed. There's usually never a /local/bin but always a /usr/local/bin and /usr/local/share/bin .

JonnyJD ,Jan 3, 2013 at 0:17

Some kind of "update" on this issue:

Recently some Linux distributions are merging /bin into /usr/bin and relatedly /lib into /usr/lib . Sometimes also (/usr)/sbin to /usr/bin (Arch Linux). So /usr is expected to be available at the same time as / .

The distinction between the two hierarchies is taken to be unnecessary complexity now. The idea was once having only /bin available at boot, but having an initial ramdisk makes this obsolete.

I know of Fedora Linux (2011) and Arch Linux (2012) going this way and Solaris is doing this for a long time (> 15 years).

xenoterracide ,Jan 17, 2011 at 16:23

On Linux /bin and /usr/bin are still separate because it is common to have /usr on a separate partition (although this configuration breaks in subtle ways, sometimes). In /bin is all the commands that you will need if you only have / mounted.

On Solaris and Arch Linux (and probably others) /bin is a symlink to /usr/bin . Arch also has /sbin and /usr/sbin symlinked to /usr/bin .

Of particular note, the statement that /bin is for "system administrator" commands and /usr/bin is for user commands is not true (unless you think that bash and ls are for admins only, in which case you have a lot to learn). Administrator commands are in /sbin and /usr/sbin .

[Feb 21, 2019] The rm='rm -i' alias is an horror

Feb 21, 2019 | superuser.com

The rm='rm -i' alias is a horror because, after using it for a while, you will expect rm to prompt you by default before removing files. Of course, one day you'll run it with an account that doesn't have that alias set, and before you understand what's going on, it is too late.

... ... ...

If you want safe aliases, but don't want to risk getting used to the commands working differently on your system than on others, you can disable rm like this:
alias rm='echo "rm is disabled, use remove or trash or /bin/rm instead."'

Then you can create your own safe alias, e.g.

alias remove='/bin/rm -irv'

or use trash instead.

[Feb 21, 2019] What is the minimum I have to do to create an RPM file?

Feb 21, 2019 | stackoverflow.com

webwesen ,Jan 29, 2016 at 6:42

I just want to create an RPM file to distribute my Linux binary "foobar", with only a couple of dependencies. It has a config file, /etc/foobar.conf and should be installed in /usr/bin/foobar.

Unfortunately the documentation for RPM is 27 chapters long and I really don't have a day to sit down and read this, because I am also busy making .deb and EXE installers for other platforms.

What is the absolute minimum I have to do to create an RPM? Assume the foobar binary and foobar.conf are in the current working directory.

icasimpan ,Apr 10, 2018 at 13:33

I often build binary RPMs to package proprietary apps - even monsters like WebSphere - on Linux, so my experience could be useful to you, though it would be better to build a true (from-source) RPM if you can. But I digress.

The basic steps for packaging your (binary) program are as follows - here I assume the program is toybinprog with version 1.0, with a config file to be installed as /etc/toybinprog/toybinprog.conf and a binary called toybinprog to be installed in /usr/bin:

1. create your rpm build env for RPM < 4.6,4.7
mkdir -p ~/rpmbuild/{RPMS,SRPMS,BUILD,SOURCES,SPECS,tmp}

cat <<EOF >~/.rpmmacros
%_topdir   %(echo $HOME)/rpmbuild
%_tmppath  %{_topdir}/tmp
EOF

cd ~/rpmbuild
2. create the tarball of your project
mkdir toybinprog-1.0
mkdir -p toybinprog-1.0/usr/bin
mkdir -p toybinprog-1.0/etc/toybinprog
install -m 755 toybinprog toybinprog-1.0/usr/bin
install -m 644 toybinprog.conf toybinprog-1.0/etc/toybinprog/

tar -zcvf toybinprog-1.0.tar.gz toybinprog-1.0/
3. Copy to the sources dir
cp toybinprog-1.0.tar.gz SOURCES/

cat <<EOF > SPECS/toybinprog.spec
# Don't try fancy stuff like debuginfo, which is useless on binary-only
# packages. Don't strip binary too
# Be sure buildpolicy set to do nothing
%define        __spec_install_post %{nil}
%define          debug_package %{nil}
%define        __os_install_post %{_dbpath}/brp-compress

Summary: A very simple toy bin rpm package
Name: toybinprog
Version: 1.0
Release: 1
License: GPL+
Group: Development/Tools
SOURCE0 : %{name}-%{version}.tar.gz
URL: http://toybinprog.company.com/

BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root

%description
%{summary}

%prep
%setup -q

%build
# Empty section.

%install
rm -rf %{buildroot}
mkdir -p  %{buildroot}

# in builddir
cp -a * %{buildroot}


%clean
rm -rf %{buildroot}


%files
%defattr(-,root,root,-)
%config(noreplace) %{_sysconfdir}/%{name}/%{name}.conf
%{_bindir}/*

%changelog
* Thu Apr 24 2009  Elia Pinto <devzero2000@rpm5.org> 1.0-1
- First Build

EOF
4. build the source and the binary rpm
rpmbuild -ba SPECS/toybinprog.spec

And that's all.

Hope this helps.


If you are distributing an application, fpm sounds perfect for your needs. There is an example here which shows how to package an app from source. FPM can produce both deb files and RPM files.

[Feb 21, 2019] perl - How to prompt for input and exit if the user entered an empty string - Stack Overflow

Feb 20, 2019 | stackoverflow.com

NewLearner ,Mar 12, 2012 at 3:22

I'm new to Perl and I'm writing a program where I want to force the user to enter a word. If the user enters an empty string then the program should exit.

This is what I have so far:

print "Enter a word to look up: ";

chomp ($usrword = <STDIN>);

DVK , Nov 19, 2015 at 19:11

You're almost there.
print "Enter a word to look up: ";
my $userword = <STDIN>; # I moved chomp to a new line to make it more readable
chomp $userword; # Get rid of newline character at the end
exit 0 if ($userword eq ""); # If empty string, exit.

Pondy , Jul 6 '16 at 22:11

File output is buffered by default. Since the prompt is so short, it is still sitting in the output buffer. You can disable buffering on STDOUT by adding this line of code before printing...
select((select(STDOUT), $|=1)[0]);
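
Putting the two answers together, a minimal sketch (not from the thread) might look like this; it simply uses $| = 1 to enable autoflush on STDOUT, which has the same effect here as the select trick above:

#!/usr/bin/perl
# Prompt for a word and exit if the user enters an empty string.
use strict;
use warnings;

$| = 1;                            # autoflush STDOUT so the prompt appears immediately
print "Enter a word to look up: ";

my $userword = <STDIN>;
exit 0 unless defined $userword;   # EOF (e.g. Ctrl-D) also ends the program
chomp $userword;
exit 0 if $userword eq "";

print "You asked about '$userword'\n";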

[Feb 11, 2019] Resuming rsync on a interrupted transfer

May 15, 2013 | stackoverflow.com

Glitches , May 15, 2013 at 18:06

I am trying to back up my file server to a remote file server using rsync. rsync is not successfully resuming when a transfer is interrupted. I used the partial option, but rsync doesn't find the file it already started on, because it renames it to a temporary file, and when the transfer is resumed it creates a new file and starts from the beginning.

Here is my command:

rsync -avztP -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"

When this command is run, a backup file named OldDisk.dmg from my local machine gets created on the remote machine as something like .OldDisk.dmg.SjDndj23 .

Now when the internet connection gets interrupted and I have to resume the transfer, I have to find where rsync left off by locating the temp file like .OldDisk.dmg.SjDndj23 and renaming it to OldDisk.dmg so that rsync sees there already exists a file that it can resume.

How do I fix this so I don't have to manually intervene each time?

Richard Michael , Nov 6, 2013 at 4:26

TL;DR : Use --timeout=X (X in seconds) to change the default rsync server timeout, not --inplace .

The issue is the rsync server processes (of which there are two, see rsync --server ... in ps output on the receiver) continue running, to wait for the rsync client to send data.

If the rsync server processes do not receive data for a sufficient time, they will indeed time out, self-terminate and clean up by moving the temporary file to its "proper" name (i.e., with no temporary suffix). You'll then be able to resume.

If you don't want to wait for the long default timeout to cause the rsync server to self-terminate, then when your internet connection returns, log into the server and clean up the rsync server processes manually. However, you must politely terminate rsync -- otherwise, it will not move the partial file into place; but rather, delete it (and thus there is no file to resume). To politely ask rsync to terminate, do not SIGKILL (e.g., -9 ), but SIGTERM (e.g., pkill -TERM -x rsync - only an example, you should take care to match only the rsync processes concerned with your client).

Fortunately there is an easier way: use the --timeout=X (X in seconds) option; it is passed to the rsync server processes as well.

For example, if you specify rsync ... --timeout=15 ... , both the client and server rsync processes will cleanly exit if they do not send/receive data in 15 seconds. On the server, this means moving the temporary file into position, ready for resuming.

I'm not sure how long, by default, the various rsync processes will try to send/receive data before they die (it might vary with operating system). In my testing, the server rsync processes remain running longer than the local client. On a "dead" network connection, the client terminates with a broken pipe (e.g., no network socket) after about 30 seconds; you could experiment or review the source code. Meaning, you could try to "ride out" the bad internet connection for 15-20 seconds.

If you do not clean up the server rsync processes (or wait for them to die), but instead immediately launch another rsync client process, two additional server processes will launch (for the other end of your new client process). Specifically, the new rsync client will not re-use/reconnect to the existing rsync server processes. Thus, you'll have two temporary files (and four rsync server processes) -- though, only the newer, second temporary file has new data being written (received from your new rsync client process).

Interestingly, if you then clean up all rsync server processes (for example, stop your client which will stop the new rsync servers, then SIGTERM the older rsync servers, it appears to merge (assemble) all the partial files into the new proper named file. So, imagine a long running partial copy which dies (and you think you've "lost" all the copied data), and a short running re-launched rsync (oops!).. you can stop the second client, SIGTERM the first servers, it will merge the data, and you can resume.

Finally, a few short remarks:

JamesTheAwesomeDude , Dec 29, 2013 at 16:50

Just curious: wouldn't SIGINT (aka ^C ) be 'politer' than SIGTERM ? – JamesTheAwesomeDude Dec 29 '13 at 16:50

Richard Michael , Dec 29, 2013 at 22:34

I didn't test how the server-side rsync handles SIGINT, so I'm not sure it will keep the partial file - you could check. Note that this doesn't have much to do with Ctrl-c ; it happens that your terminal sends SIGINT to the foreground process when you press Ctrl-c , but the server-side rsync has no controlling terminal. You must log in to the server and use kill . The client-side rsync will not send a message to the server (for example, after the client receives SIGINT via your terminal Ctrl-c ) - might be interesting though. As for anthropomorphizing, not sure what's "politer". :-) – Richard Michael Dec 29 '13 at 22:34

d-b , Feb 3, 2015 at 8:48

I just tried this timeout argument rsync -av --delete --progress --stats --human-readable --checksum --timeout=60 --partial-dir /tmp/rsync/ rsync://$remote:/ /src/ but then it timed out during the "receiving file list" phase (which in this case takes around 30 minutes). Setting the timeout to half an hour kind of defeats the purpose. Is there any workaround for this? – d-b Feb 3 '15 at 8:48

Cees Timmerman , Sep 15, 2015 at 17:10

@user23122 --checksum reads all data when preparing the file list, which is great for many small files that change often, but should be done on-demand for large files. – Cees Timmerman Sep 15 '15 at 17:10

[Jan 29, 2019] Do journaling filesystems guarantee against corruption after a power failure

Jan 29, 2019 | unix.stackexchange.com

Nathan Osman ,May 6, 2011 at 1:50

I am asking this question on behalf of another user who raised the issue in the Ubuntu chat room.

Do journaling filesystems guarantee that no corruption will occur if a power failure occurs?

If this answer depends on the filesystem, please indicate which ones do protect against corruption and which ones don't.

Andrew Lambert ,May 6, 2011 at 2:51

There are no guarantees. A Journaling File System is more resilient and is less prone to corruption, but not immune.

A journal is just a list of operations which have recently been done to the file system. The crucial part is that the journal entry is made before the operations take place. Most operations have multiple steps. Deleting a file, for example, might entail deleting the file's entry in the file system's table of contents and then marking the sectors on the drive as free. If something happens between the two steps, a journaled file system can tell immediately and perform the necessary clean up to keep everything consistent. This is not the case with a non-journaled file system, which has to look at the entire contents of the volume to find errors.

While this journaling is much less prone to corruption than not journaling, corruption can still occur. For example, if the hard drive is mechanically malfunctioning or if writes to the journal itself are failing or interrupted.

The basic premise of journaling is that writing a journal entry is much quicker, usually, than the actual transaction it describes will be. So, the period between the OS ordering a (journal) write and the hard drive fulfilling it is much shorter than for a normal write: a narrower window for things to go wrong in, but there's still a window.

Further reading

Nathan Osman ,May 6, 2011 at 2:57

Could you please elaborate a little bit on why this is true? Perhaps you could give an example of how corruption would occur in a certain scenario. – Nathan Osman May 6 '11 at 2:57

Andrew Lambert ,May 6, 2011 at 3:21

@George Edison See my expanded answer. – Andrew Lambert May 6 '11 at 3:21

psusi ,May 6, 2011 at 17:58

That last bit is incorrect; there is no window for things to go wrong. Since it records what it is about to do before it starts doing it, the operation can be restarted after the power failure, no matter at what point it occurs during the operation. It is a matter of ordering, not timing. – psusi May 6 '11 at 17:58

Andrew Lambert ,May 6, 2011 at 21:23

@psusi there is still a window for the write to the journal to be interrupted. Journal writes may appear atomic to the OS but they're still writes to the disk. – Andrew Lambert May 6 '11 at 21:23

psusi ,May 7, 2011 at 1:57

@Amazed they are atomic because they have sequence numbers and/or checksums, so the journal entry is either written entirely, or not. If it is not written entirely, it is simply ignored after the system restarts, and no further changes were made to the fs so it remains consistent. – psusi May 7 '11 at 1:57

Mikel ,May 6, 2011 at 6:03

No.

The most common type of journaling, called metadata journaling, only protects the integrity of the file system, not of data. This includes xfs, and ext3/ext4 in the default data=ordered mode.

If a non-journaling file system suffers a crash, it will be checked using fsck on the next boot. fsck scans every inode on the file system, looking for blocks that are marked as used but are not reachable (i.e. have no file name), and marks those blocks as unused. Doing this takes a long time.
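To force that kind of full check manually, it has to be run on an unmounted filesystem, roughly like this (a sketch; the device name is a placeholder):

umount /dev/sdb1
e2fsck -f /dev/sdb1    # -f forces a check even if the filesystem looks clean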

With a metadata journaling file system, instead of doing an fsck, it knows which blocks it was in the middle of changing, so it can mark them as free without searching the whole partition for them.

There is a less common type of journaling, called data journaling, which is what ext3 does if you mount it with the data=journal option.

It attempts to protect all your data by writing not just a list of logical operations, but also the entire contents of each write to the journal. But because it's writing your data twice, it can be much slower.
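If you want to try it, full data journaling is just a mount option (a sketch; the device and mount point are placeholders, and note that the data mode generally cannot be switched with a plain remount):

mount -o data=journal /dev/sdb1 /mnt/data
# or persistently, via an /etc/fstab line:
# /dev/sdb1  /mnt/data  ext4  defaults,data=journal  0  2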

As others have pointed out, even this is not a guarantee, because the hard drive might have told the operating system it had stored the data, when it fact it was still in the hard drive's cache.

For more information, take a look at the Wikipedia Journaling File System article and the Data Mode section of the ext4 documentation .

SplinterReality ,May 6, 2011 at 8:03

+1 for the distinction between file system corruption and data corruption. That little distinction is quite the doozy in practice. – SplinterReality May 6 '11 at 8:03

boehj ,May 6, 2011 at 10:57

Excuse my utter ignorance, but doesn't data=journal as a feature make no sense at all? – boehj May 6 '11 at 10:57

psusi ,May 6, 2011 at 18:11

Again, the OS knows when the drive caches data and forces it to flush it when needed in order to maintain a coherent fs. Your data file of course, can be lost or corrupted if the application that was writing it when the power failed was not doing so carefully, and that applies whether or not you use data=journal. – psusi May 6 '11 at 18:11

user3338098 ,Aug 1, 2016 at 16:30

@psusi it doesn't matter how careful the program is in writing the data, plenty of hard drives silently corrupt the data on READING: stackoverflow.com/q/34141117/3338098 – user3338098 Aug 1 '16 at 16:30

psusi ,Aug 21, 2016 at 3:22

@user3338098, drives that silently corrupt data are horribly broken and should not ever be used, and are an entirely different conversation than corruption caused by software doing the wrong thing. – psusi Aug 21 '16 at 3:22

camh ,May 6, 2011 at 3:26

A filesystem cannot guarantee consistency if a power failure occurs, because it does not know what the hardware will do.

If a hard drive buffers data for write but tells the OS that it has written the data and does not support the appropriate write barriers, then out-of-order writes can occur where an earlier write has not hit the platter, but a later one has. See this serverfault answer for more details.

Also, the position of the head on a magnetic HDD is controlled with electro-magnets. If power fails in the middle of a write, it is possible for some data to continue to be written while the heads move, corrupting data on blocks that the filesystem never intended to be written.
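As for the write-caching part, the drive's volatile cache can at least be inspected and, at some performance cost, disabled with hdparm (a sketch; /dev/sda is a placeholder):

hdparm -W /dev/sda      # report the current write-caching setting
hdparm -W 0 /dev/sda    # disable the on-drive write cache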

Nathan Osman ,May 6, 2011 at 6:43

Isn't the drive's firmware smart enough to suspend writing when retracting the head? – Nathan Osman May 6 '11 at 6:43

camh ,May 6, 2011 at 7:54

@George: It's going to depend on the drive. There's a lot out there and you don't know how well your (cheap) drive does things. – camh May 6 '11 at 7:54

psusi ,May 6, 2011 at 18:05

The hard drive tells the OS if it uses a write behind cache, and the OS takes measures to ensure they are flushed in the correct order. Also drives are designed so that when the power fails, they stop writing. I have seen some cases where the sector being written at the time of power loss becomes corrupt because it did not finish updating the ecc ( but can be easily re-written correctly ), but never heard of random sectors being corrupted on power loss. – psusi May 6 '11 at 18:05

jlliagre ,May 6, 2011 at 8:35

ZFS, which is close to but not exactly a journaling filesystem, guarantees by design against corruption after a power failure.

It doesn't matter if an ongoing write is interrupted in the middle: in that case its checksum will certainly be incorrect, so the block will be ignored. Since the file system is copy-on-write, the previous correct data (or metadata) is still on disk and will be used instead.

sakisk ,May 6, 2011 at 10:13

The answer is in most cases no.

Nathan Osman ,May 6, 2011 at 16:35

What events could lead to a corrupt journal? The only thing I could think of was bad sectors - is there anything else? – Nathan Osman May 6 '11 at 16:35

sakisk ,May 7, 2011 at 13:21

That's right, hardware failures are the usual case. – sakisk May 7 '11 at 13:21

[Jan 29, 2019] Split string into an array in Bash

May 14, 2012 | stackoverflow.com

Lgn ,May 14, 2012 at 15:15

In a Bash script I would like to split a line into pieces and store them in an array.

The line:

Paris, France, Europe

I would like to have them in an array like this:

array[0] = Paris
array[1] = France
array[2] = Europe

I would like to use simple code, the command's speed doesn't matter. How can I do it?

antak ,Jun 18, 2018 at 9:22

This is #1 Google hit but there's controversy in the answer because the question unfortunately asks about delimiting on , (comma-space) and not a single character such as comma. If you're only interested in the latter, answers here are easier to follow: stackoverflow.com/questions/918886/ – antak Jun 18 '18 at 9:22

Dennis Williamson ,May 14, 2012 at 15:16

IFS=', ' read -r -a array <<< "$string"

Note that the characters in $IFS are treated individually as separators so that in this case fields may be separated by either a comma or a space rather than the sequence of the two characters. Interestingly though, empty fields aren't created when comma-space appears in the input because the space is treated specially.

To access an individual element:

echo "${array[0]}"

To iterate over the elements:

for element in "${array[@]}"
do
    echo "$element"
done

To get both the index and the value:

for index in "${!array[@]}"
do
    echo "$index ${array[index]}"
done

The last example is useful because Bash arrays are sparse. In other words, you can delete an element or add an element and then the indices are not contiguous.

unset "array[1]"
array[42]=Earth

To get the number of elements in an array:

echo "${#array[@]}"

As mentioned above, arrays can be sparse so you shouldn't use the length to get the last element. Here's how you can do it in Bash 4.2 and later:

echo "${array[-1]}"

in any version of Bash (from somewhere after 2.05b):

echo "${array[@]: -1:1}"

Larger negative offsets select farther from the end of the array. Note the space before the minus sign in the older form. It is required.

l0b0 ,May 14, 2012 at 15:24

Just use IFS=', ' , then you don't have to remove the spaces separately. Test: IFS=', ' read -a array <<< "Paris, France, Europe"; echo "${array[@]}" – l0b0 May 14 '12 at 15:24

Dennis Williamson ,May 14, 2012 at 16:33

@l0b0: Thanks. I don't know what I was thinking. I like to use declare -p array for test output, by the way. – Dennis Williamson May 14 '12 at 16:33

Nathan Hyde ,Mar 16, 2013 at 21:09

@Dennis Williamson - Awesome, thorough answer. – Nathan Hyde Mar 16 '13 at 21:09

dsummersl ,Aug 9, 2013 at 14:06

MUCH better than multiple cut -f calls! – dsummersl Aug 9 '13 at 14:06

caesarsol ,Oct 29, 2015 at 14:45

Warning: the IFS variable means split by one of these characters , so it's not a sequence of chars to split by. IFS=', ' read -a array <<< "a,d r s,w" => ${array[*]} == a d r s w – caesarsol Oct 29 '15 at 14:45

Jim Ho ,Mar 14, 2013 at 2:20

Here is a way without setting IFS:
string="1:2:3:4:5"
set -f                      # avoid globbing (expansion of *).
array=(${string//:/ })
for i in "${!array[@]}"
do
    echo "$i=>${array[i]}"
done

The idea is using string replacement:

${string//substring/replacement}

to replace all matches of $substring with white space and then using the substituted string to initialize an array:

(element1 element2 ... elementN)

Note: this answer makes use of the split+glob operator. Thus, to prevent expansion of some characters (such as *) it is a good idea to pause globbing for this script.

Werner Lehmann ,May 4, 2013 at 22:32

Used this approach... until I came across a long string to split. 100% CPU for more than a minute (then I killed it). It's a pity because this method allows splitting by a string, not just some character in IFS. – Werner Lehmann May 4 '13 at 22:32

Dieter Gribnitz ,Sep 2, 2014 at 15:46

WARNING: Just ran into a problem with this approach. If you have an element named * you will get all the elements of your cwd as well. Thus string="1:2:3:4:*" will give some unexpected and possibly dangerous results depending on your implementation. Did not get the same error with (IFS=', ' read -a array <<< "$string") and this one seems safe to use. – Dieter Gribnitz Sep 2 '14 at 15:46

akostadinov ,Nov 6, 2014 at 14:31

not reliable for many kinds of values, use with care – akostadinov Nov 6 '14 at 14:31

Andrew White ,Jun 1, 2016 at 11:44

quoting ${string//:/ } prevents shell expansion – Andrew White Jun 1 '16 at 11:44

Mark Thomson ,Jun 5, 2016 at 20:44

I had to use the following on OSX: array=(${string//:/ }) – Mark Thomson Jun 5 '16 at 20:44

bgoldst ,Jul 19, 2017 at 21:20

All of the answers to this question are wrong in one way or another.

Wrong answer #1

IFS=', ' read -r -a array <<< "$string"

1: This is a misuse of $IFS. The value of the $IFS variable is not taken as a single variable-length string separator; rather, it is taken as a set of single-character string separators, where each field that read splits off from the input line can be terminated by any character in the set (comma or space, in this example).

Actually, for the real sticklers out there, the full meaning of $IFS is slightly more involved. From the bash manual :

The shell treats each character of IFS as a delimiter, and splits the results of the other expansions into words using these characters as field terminators. If IFS is unset, or its value is exactly <space><tab><newline> , the default, then sequences of <space> , <tab> , and <newline> at the beginning and end of the results of the previous expansions are ignored, and any sequence of IFS characters not at the beginning or end serves to delimit words. If IFS has a value other than the default, then sequences of the whitespace characters <space> , <tab> , and <newline> are ignored at the beginning and end of the word, as long as the whitespace character is in the value of IFS (an IFS whitespace character). Any character in IFS that is not IFS whitespace, along with any adjacent IFS whitespace characters, delimits a field. A sequence of IFS whitespace characters is also treated as a delimiter. If the value of IFS is null, no word splitting occurs.

Basically, for non-default non-null values of $IFS , fields can be separated with either (1) a sequence of one or more characters that are all from the set of "IFS whitespace characters" (that is, whichever of <space> , <tab> , and <newline> ("newline" meaning line feed (LF) ) are present anywhere in $IFS ), or (2) any non-"IFS whitespace character" that's present in $IFS along with whatever "IFS whitespace characters" surround it in the input line.

For the OP, it's possible that the second separation mode I described in the previous paragraph is exactly what he wants for his input string, but we can be pretty confident that the first separation mode I described is not correct at all. For example, what if his input string was 'Los Angeles, United States, North America' ?

IFS=', ' read -ra a <<<'Los Angeles, United States, North America'; declare -p a;
## declare -a a=([0]="Los" [1]="Angeles" [2]="United" [3]="States" [4]="North" [5]="America")

2: Even if you were to use this solution with a single-character separator (such as a comma by itself, that is, with no following space or other baggage), if the value of the $string variable happens to contain any LFs, then read will stop processing once it encounters the first LF. The read builtin only processes one line per invocation. This is true even if you are piping or redirecting input only to the read statement, as we are doing in this example with the here-string mechanism, and thus unprocessed input is guaranteed to be lost. The code that powers the read builtin has no knowledge of the data flow within its containing command structure.

You could argue that this is unlikely to cause a problem, but still, it's a subtle hazard that should be avoided if possible. It is caused by the fact that the read builtin actually does two levels of input splitting: first into lines, then into fields. Since the OP only wants one level of splitting, this usage of the read builtin is not appropriate, and we should avoid it.
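For instance, a quick sketch of that single-line behavior:

string=$'Paris, France\nEurope, Asia'; IFS=', ' read -r -a a <<<"$string"; declare -p a;
## declare -a a=([0]="Paris" [1]="France")   ## the second input line is silently dropped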

3: A non-obvious potential issue with this solution is that read always drops the trailing field if it is empty, although it preserves empty fields otherwise. Here's a demo:

string=', , a, , b, c, , , '; IFS=', ' read -ra a <<<"$string"; declare -p a;
## declare -a a=([0]="" [1]="" [2]="a" [3]="" [4]="b" [5]="c" [6]="" [7]="")

Maybe the OP wouldn't care about this, but it's still a limitation worth knowing about. It reduces the robustness and generality of the solution.

This problem can be solved by appending a dummy trailing delimiter to the input string just prior to feeding it to read , as I will demonstrate later.


Wrong answer #2

string="1:2:3:4:5"
set -f                     # avoid globbing (expansion of *).
array=(${string//:/ })

Similar idea:

t="one,two,three"
a=($(echo $t | tr ',' "\n"))

(Note: I added the missing parentheses around the command substitution which the answerer seems to have omitted.)

Similar idea:

string="1,2,3,4"
array=(`echo $string | sed 's/,/\n/g'`)

These solutions leverage word splitting in an array assignment to split the string into fields. Funnily enough, just like read , general word splitting also uses the $IFS special variable, although in this case it is implied that it is set to its default value of <space><tab><newline> , and therefore any sequence of one or more IFS characters (which are all whitespace characters now) is considered to be a field delimiter.

This solves the problem of two levels of splitting committed by read , since word splitting by itself constitutes only one level of splitting. But just as before, the problem here is that the individual fields in the input string can already contain $IFS characters, and thus they would be improperly split during the word splitting operation. This happens to not be the case for any of the sample input strings provided by these answerers (how convenient...), but of course that doesn't change the fact that any code base that used this idiom would then run the risk of blowing up if this assumption were ever violated at some point down the line. Once again, consider my counterexample of 'Los Angeles, United States, North America' (or 'Los Angeles:United States:North America' ).
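To see that failure concretely with the counterexample (a quick sketch):

string='Los Angeles, United States, North America'; a=($string); declare -p a;
## declare -a a=([0]="Los" [1]="Angeles," [2]="United" [3]="States," [4]="North" [5]="America")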

Also, word splitting is normally followed by filename expansion ( aka pathname expansion aka globbing), which, if done, would potentially corrupt words containing the characters * , ? , or [ followed by ] (and, if extglob is set, parenthesized fragments preceded by ? , * , + , @ , or ! ) by matching them against file system objects and expanding the words ("globs") accordingly. The first of these three answerers has cleverly undercut this problem by running set -f beforehand to disable globbing. Technically this works (although you should probably add set +f afterward to reenable globbing for subsequent code which may depend on it), but it's undesirable to have to mess with global shell settings in order to hack a basic string-to-array parsing operation in local code.
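If you do use this idiom anyway, the minimal way to contain that global setting is something like this (sketch):

set -f                      # disable globbing just for the assignment
array=(${string//:/ })
set +f                      # restore globbing for the rest of the script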

Another issue with this answer is that all empty fields will be lost. This may or may not be a problem, depending on the application.

Note: If you're going to use this solution, it's better to use the ${string//:/ } "pattern substitution" form of parameter expansion , rather than going to the trouble of invoking a command substitution (which forks the shell), starting up a pipeline, and running an external executable ( tr or sed ), since parameter expansion is purely a shell-internal operation. (Also, for the tr and sed solutions, the input variable should be double-quoted inside the command substitution; otherwise word splitting would take effect in the echo command and potentially mess with the field values. Also, the $(...) form of command substitution is preferable to the old `...` form since it simplifies nesting of command substitutions and allows for better syntax highlighting by text editors.)


Wrong answer #3

str="a, b, c, d"  # assuming there is a space after ',' as in Q
arr=(${str//,/})  # delete all occurrences of ','

This answer is almost the same as #2. The difference is that the answerer has made the assumption that the fields are delimited by two characters, one of which is represented in the default $IFS, and the other not. He has solved this rather specific case by removing the non-IFS-represented character using a pattern substitution expansion and then using word splitting to split the fields on the surviving IFS-represented delimiter character.

This is not a very generic solution. Furthermore, it can be argued that the comma is really the "primary" delimiter character here, and that stripping it and then depending on the space character for field splitting is simply wrong. Once again, consider my counterexample: 'Los Angeles, United States, North America' .

Also, again, filename expansion could corrupt the expanded words, but this can be prevented by temporarily disabling globbing for the assignment with set -f and then set +f .

Also, again, all empty fields will be lost, which may or may not be a problem depending on the application.


Wrong answer #4

string='first line
second line
third line'

oldIFS="$IFS"
IFS='
'
IFS=${IFS:0:1} # this is useful to format your code with tabs
lines=( $string )
IFS="$oldIFS"

This is similar to #2 and #3 in that it uses word splitting to get the job done, only now the code explicitly sets $IFS to contain only the single-character field delimiter present in the input string. It should be repeated that this cannot work for multicharacter field delimiters such as the OP's comma-space delimiter. But for a single-character delimiter like the LF used in this example, it actually comes close to being perfect. The fields cannot be unintentionally split in the middle as we saw with previous wrong answers, and there is only one level of splitting, as required.

One problem is that filename expansion will corrupt affected words as described earlier, although once again this can be solved by wrapping the critical statement in set -f and set +f .

Another potential problem is that, since LF qualifies as an "IFS whitespace character" as defined earlier, all empty fields will be lost, just as in #2 and #3 . This would of course not be a problem if the delimiter happens to be a non-"IFS whitespace character", and depending on the application it may not matter anyway, but it does vitiate the generality of the solution.

So, to sum up, assuming you have a one-character delimiter, and it is either a non-"IFS whitespace character" or you don't care about empty fields, and you wrap the critical statement in set -f and set +f , then this solution works, but otherwise not.

(Also, for information's sake, assigning a LF to a variable in bash can be done more easily with the $'...' syntax, e.g. IFS=$'\n'; .)


Wrong answer #5

countries='Paris, France, Europe'
OIFS="$IFS"
IFS=', ' array=($countries)
IFS="$OIFS"

Similar idea:

IFS=', ' eval 'array=($string)'

This solution is effectively a cross between #1 (in that it sets $IFS to comma-space) and #2-4 (in that it uses word splitting to split the string into fields). Because of this, it suffers from most of the problems that afflict all of the above wrong answers, sort of like the worst of all worlds.

Also, regarding the second variant, it may seem like the eval call is completely unnecessary, since its argument is a single-quoted string literal, and therefore is statically known. But there's actually a very non-obvious benefit to using eval in this way. Normally, when you run a simple command which consists of a variable assignment only , meaning without an actual command word following it, the assignment takes effect in the shell environment:

IFS=', '; ## changes $IFS in the shell environment

This is true even if the simple command involves multiple variable assignments; again, as long as there's no command word, all variable assignments affect the shell environment:

IFS=', ' array=($countries); ## changes both $IFS and $array in the shell environment

But, if the variable assignment is attached to a command name (I like to call this a "prefix assignment") then it does not affect the shell environment, and instead only affects the environment of the executed command, regardless whether it is a builtin or external:

IFS=', ' :; ## : is a builtin command, the $IFS assignment does not outlive it
IFS=', ' env; ## env is an external command, the $IFS assignment does not outlive it

Relevant quote from the bash manual :

If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment.

It is possible to exploit this feature of variable assignment to change $IFS only temporarily, which allows us to avoid the whole save-and-restore gambit like that which is being done with the $OIFS variable in the first variant. But the challenge we face here is that the command we need to run is itself a mere variable assignment, and hence it would not involve a command word to make the $IFS assignment temporary. You might think to yourself, well why not just add a no-op command word to the statement like the : builtin to make the $IFS assignment temporary? This does not work because it would then make the $array assignment temporary as well:

IFS=', ' array=($countries) :; ## fails; new $array value never escapes the : command

So, we're effectively at an impasse, a bit of a catch-22. But, when eval runs its code, it runs it in the shell environment, as if it was normal, static source code, and therefore we can run the $array assignment inside the eval argument to have it take effect in the shell environment, while the $IFS prefix assignment that is prefixed to the eval command will not outlive the eval command. This is exactly the trick that is being used in the second variant of this solution:

IFS=', ' eval 'array=($string)'; ## $IFS does not outlive the eval command, but $array does

So, as you can see, it's actually quite a clever trick, and accomplishes exactly what is required (at least with respect to assignment effectation) in a rather non-obvious way. I'm actually not against this trick in general, despite the involvement of eval ; just be careful to single-quote the argument string to guard against security threats.

But again, because of the "worst of all worlds" agglomeration of problems, this is still a wrong answer to the OP's requirement.


Wrong answer #6

IFS=', '; array=(Paris, France, Europe)

IFS=' ';declare -a array=(Paris France Europe)

Um... what? The OP has a string variable that needs to be parsed into an array. This "answer" starts with the verbatim contents of the input string pasted into an array literal. I guess that's one way to do it.

It looks like the answerer may have assumed that the $IFS variable affects all bash parsing in all contexts, which is not true. From the bash manual:

IFS The Internal Field Separator that is used for word splitting after expansion and to split lines into words with the read builtin command. The default value is <space><tab><newline> .

So the $IFS special variable is actually only used in two contexts: (1) word splitting that is performed after expansion (meaning not when parsing bash source code) and (2) for splitting input lines into words by the read builtin.

Let me try to make this clearer. I think it might be good to draw a distinction between parsing and execution . Bash must first parse the source code, which obviously is a parsing event, and then later it executes the code, which is when expansion comes into the picture. Expansion is really an execution event. Furthermore, I take issue with the description of the $IFS variable that I just quoted above; rather than saying that word splitting is performed after expansion , I would say that word splitting is performed during expansion, or, perhaps even more precisely, word splitting is part of the expansion process. The phrase "word splitting" refers only to this step of expansion; it should never be used to refer to the parsing of bash source code, although unfortunately the docs do seem to throw around the words "split" and "words" a lot. Here's a relevant excerpt from the linux.die.net version of the bash manual:

Expansion is performed on the command line after it has been split into words. There are seven kinds of expansion performed: brace expansion , tilde expansion , parameter and variable expansion , command substitution , arithmetic expansion , word splitting , and pathname expansion .

The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and pathname expansion.

You could argue the GNU version of the manual does slightly better, since it opts for the word "tokens" instead of "words" in the first sentence of the Expansion section:

Expansion is performed on the command line after it has been split into tokens.

The important point is, $IFS does not change the way bash parses source code. Parsing of bash source code is actually a very complex process that involves recognition of the various elements of shell grammar, such as command sequences, command lists, pipelines, parameter expansions, arithmetic substitutions, and command substitutions. For the most part, the bash parsing process cannot be altered by user-level actions like variable assignments (actually, there are some minor exceptions to this rule; for example, see the various compatxx shell settings , which can change certain aspects of parsing behavior on-the-fly). The upstream "words"/"tokens" that result from this complex parsing process are then expanded according to the general process of "expansion" as broken down in the above documentation excerpts, where word splitting of the expanded (expanding?) text into downstream words is simply one step of that process. Word splitting only touches text that has been spit out of a preceding expansion step; it does not affect literal text that was parsed right off the source bytestream.


Wrong answer #7

string='first line
        second line
        third line'

while read -r line; do lines+=("$line"); done <<<"$string"

This is one of the best solutions. Notice that we're back to using read . Didn't I say earlier that read is inappropriate because it performs two levels of splitting, when we only need one? The trick here is that you can call read in such a way that it effectively only does one level of splitting, specifically by splitting off only one field per invocation, which necessitates the cost of having to call it repeatedly in a loop. It's a bit of a sleight of hand, but it works.

But there are problems. First: When you provide at least one NAME argument to read , it automatically ignores leading and trailing whitespace in each field that is split off from the input string. This occurs whether $IFS is set to its default value or not, as described earlier in this post. Now, the OP may not care about this for his specific use-case, and in fact, it may be a desirable feature of the parsing behavior. But not everyone who wants to parse a string into fields will want this. There is a solution, however: A somewhat non-obvious usage of read is to pass zero NAME arguments. In this case, read will store the entire input line that it gets from the input stream in a variable named $REPLY , and, as a bonus, it does not strip leading and trailing whitespace from the value. This is a very robust usage of read which I've exploited frequently in my shell programming career. Here's a demonstration of the difference in behavior:

string=$'  a  b  \n  c  d  \n  e  f  '; ## input string

a=(); while read -r line; do a+=("$line"); done <<<"$string"; declare -p a;
## declare -a a=([0]="a  b" [1]="c  d" [2]="e  f") ## read trimmed surrounding whitespace

a=(); while read -r; do a+=("$REPLY"); done <<<"$string"; declare -p a;
## declare -a a=([0]="  a  b  " [1]="  c  d  " [2]="  e  f  ") ## no trimming

The second issue with this solution is that it does not actually address the case of a custom field separator, such as the OP's comma-space. As before, multicharacter separators are not supported, which is an unfortunate limitation of this solution. We could try to at least split on comma by specifying the separator to the -d option, but look what happens:

string='Paris, France, Europe';
a=(); while read -rd,; do a+=("$REPLY"); done <<<"$string"; declare -p a;
## declare -a a=([0]="Paris" [1]=" France")

Predictably, the unaccounted surrounding whitespace got pulled into the field values, and hence this would have to be corrected subsequently through trimming operations (this could also be done directly in the while-loop). But there's another obvious error: Europe is missing! What happened to it? The answer is that read returns a failing return code if it hits end-of-file (in this case we can call it end-of-string) without encountering a final field terminator on the final field. This causes the while-loop to break prematurely and we lose the final field.

Technically this same error afflicted the previous examples as well; the difference there is that the field separator was taken to be LF, which is the default when you don't specify the -d option, and the <<< ("here-string") mechanism automatically appends a LF to the string just before it feeds it as input to the command. Hence, in those cases, we sort of accidentally solved the problem of a dropped final field by unwittingly appending an additional dummy terminator to the input. Let's call this solution the "dummy-terminator" solution. We can apply the dummy-terminator solution manually for any custom delimiter by concatenating it against the input string ourselves when instantiating it in the here-string:

a=(); while read -rd,; do a+=("$REPLY"); done <<<"$string,"; declare -p a;
declare -a a=([0]="Paris" [1]=" France" [2]=" Europe")

There, problem solved. Another solution is to only break the while-loop if both (1) read returned failure and (2) $REPLY is empty, meaning read was not able to read any characters prior to hitting end-of-file. Demo:

a=(); while read -rd,|| [[ -n "$REPLY" ]]; do a+=("$REPLY"); done <<<"$string"; declare -p a;
## declare -a a=([0]="Paris" [1]=" France" [2]=$' Europe\n')

This approach also reveals the secretive LF that automatically gets appended to the here-string by the <<< redirection operator. It could of course be stripped off separately through an explicit trimming operation as described a moment ago, but obviously the manual dummy-terminator approach solves it directly, so we could just go with that. The manual dummy-terminator solution is actually quite convenient in that it solves both of these two problems (the dropped-final-field problem and the appended-LF problem) in one go.

So, overall, this is quite a powerful solution. Its only remaining weakness is a lack of support for multicharacter delimiters, which I will address later.


Wrong answer #8

string='first line
        second line
        third line'

readarray -t lines <<<"$string"

(This is actually from the same post as #7 ; the answerer provided two solutions in the same post.)

The readarray builtin, which is a synonym for mapfile , is ideal. It's a builtin command which parses a bytestream into an array variable in one shot; no messing with loops, conditionals, substitutions, or anything else. And it doesn't surreptitiously strip any whitespace from the input string. And (if -O is not given) it conveniently clears the target array before assigning to it. But it's still not perfect, hence my criticism of it as a "wrong answer".

First, just to get this out of the way, note that, just like the behavior of read when doing field-parsing, readarray drops the trailing field if it is empty. Again, this is probably not a concern for the OP, but it could be for some use-cases. I'll come back to this in a moment.

Second, as before, it does not support multicharacter delimiters. I'll give a fix for this in a moment as well.

Third, the solution as written does not parse the OP's input string, and in fact, it cannot be used as-is to parse it. I'll expand on this momentarily as well.

For the above reasons, I still consider this to be a "wrong answer" to the OP's question. Below I'll give what I consider to be the right answer.


Right answer

Here's a naïve attempt to make #8 work by just specifying the -d option:

string='Paris, France, Europe';
readarray -td, a <<<"$string"; declare -p a;
## declare -a a=([0]="Paris" [1]=" France" [2]=$' Europe\n')

We see the result is identical to the result we got from the double-conditional approach of the looping read solution discussed in #7 . We can almost solve this with the manual dummy-terminator trick:

readarray -td, a <<<"$string,"; declare -p a;
## declare -a a=([0]="Paris" [1]=" France" [2]=" Europe" [3]=$'\n')

The problem here is that readarray preserved the trailing field, since the <<< redirection operator appended the LF to the input string, and therefore the trailing field was not empty (otherwise it would've been dropped). We can take care of this by explicitly unsetting the final array element after-the-fact:

readarray -td, a <<<"$string,"; unset 'a[-1]'; declare -p a;
## declare -a a=([0]="Paris" [1]=" France" [2]=" Europe")

The only two problems that remain, which are actually related, are (1) the extraneous whitespace that needs to be trimmed, and (2) the lack of support for multicharacter delimiters.

The whitespace could of course be trimmed afterward (for example, see How to trim whitespace from a Bash variable? ). But if we can hack a multicharacter delimiter, then that would solve both problems in one shot.
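One rough way to do that trimming afterward, assuming the fields ended up in the array a, is a pure-bash loop such as this sketch:

for i in "${!a[@]}"; do
    a[i]="${a[i]#"${a[i]%%[![:space:]]*}"}";    ## strip leading whitespace
    a[i]="${a[i]%"${a[i]##*[![:space:]]}"}";    ## strip trailing whitespace
done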

Unfortunately, there's no direct way to get a multicharacter delimiter to work. The best solution I've thought of is to preprocess the input string to replace the multicharacter delimiter with a single-character delimiter that will be guaranteed not to collide with the contents of the input string. The only character that has this guarantee is the NUL byte . This is because, in bash (though not in zsh, incidentally), variables cannot contain the NUL byte. This preprocessing step can be done inline in a process substitution. Here's how to do it using awk :

readarray -td '' a < <(awk '{ gsub(/, /,"\0"); print; }' <<<"$string, "); unset 'a[-1]';
declare -p a;
## declare -a a=([0]="Paris" [1]="France" [2]="Europe")

There, finally! This solution will not erroneously split fields in the middle, will not cut out prematurely, will not drop empty fields, will not corrupt itself on filename expansions, will not automatically strip leading and trailing whitespace, will not leave a stowaway LF on the end, does not require loops, and does not settle for a single-character delimiter.


Trimming solution

Lastly, I wanted to demonstrate my own fairly intricate trimming solution using the obscure -C callback option of readarray . Unfortunately, I've run out of room against Stack Overflow's draconian 30,000 character post limit, so I won't be able to explain it. I'll leave that as an exercise for the reader.

function mfcb { local val="$4"; "$1"; eval "$2[$3]=\$val;"; };
function val_ltrim { if [[ "$val" =~ ^[[:space:]]+ ]]; then val="${val:${#BASH_REMATCH[0]}}"; fi; };
function val_rtrim { if [[ "$val" =~ [[:space:]]+$ ]]; then val="${val:0:${#val}-${#BASH_REMATCH[0]}}"; fi; };
function val_trim { val_ltrim; val_rtrim; };
readarray -c1 -C 'mfcb val_trim a' -td, <<<"$string,"; unset 'a[-1]'; declare -p a;
## declare -a a=([0]="Paris" [1]="France" [2]="Europe")

fbicknel ,Aug 18, 2017 at 15:57

It may also be helpful to note (though understandably you had no room to do so) that the -d option to readarray first appears in Bash 4.4. – fbicknel Aug 18 '17 at 15:57

Cyril Duchon-Doris ,Nov 3, 2017 at 9:16

You should add a "TL;DR : scroll 3 pages to see the right solution at the end of my answer" – Cyril Duchon-Doris Nov 3 '17 at 9:16

dawg ,Nov 26, 2017 at 22:28

Great answer (+1). If you change your awk to awk '{ gsub(/,[ ]+|$/,"\0"); print }' and eliminate that concatenation of the final ", " then you don't have to go through the gymnastics on eliminating the final record. So: readarray -td '' a < <(awk '{ gsub(/,[ ]+/,"\0"); print; }' <<<"$string") on Bash that supports readarray . Note your method is Bash 4.4+ I think because of the -d in readarray – dawg Nov 26 '17 at 22:28

datUser ,Feb 22, 2018 at 14:54

Looks like readarray is not an available builtin on OSX. – datUser Feb 22 '18 at 14:54

bgoldst ,Feb 23, 2018 at 3:37

@datUser That's unfortunate. Your version of bash must be too old for readarray . In this case, you can use the second-best solution built on read . I'm referring to this: a=(); while read -rd,; do a+=("$REPLY"); done <<<"$string,"; (with the awk substitution if you need multicharacter delimiter support). Let me know if you run into any problems; I'm pretty sure this solution should work on fairly old versions of bash, back to version 2-something, released like two decades ago. – bgoldst Feb 23 '18 at 3:37

Jmoney38 ,Jul 14, 2015 at 11:54

t="one,two,three"
a=($(echo "$t" | tr ',' '\n'))
echo "${a[2]}"

Prints three

shrimpwagon ,Oct 16, 2015 at 20:04

I actually prefer this approach. Simple. – shrimpwagon Oct 16 '15 at 20:04

Ben ,Oct 31, 2015 at 3:11

I copied and pasted this and it did not work with echo, but did work when I used it in a for loop. – Ben Oct 31 '15 at 3:11

Pinaki Mukherjee ,Nov 9, 2015 at 20:22

This is the simplest approach. thanks – Pinaki Mukherjee Nov 9 '15 at 20:22

abalter ,Aug 30, 2016 at 5:13

This does not work as stated. @Jmoney38 or shrimpwagon if you can paste this in a terminal and get the desired output, please paste the result here. – abalter Aug 30 '16 at 5:13

leaf ,Jul 17, 2017 at 16:28

@abalter Works for me with a=($(echo $t | tr ',' "\n")) . Same result with a=($(echo $t | tr ',' ' ')) . – leaf Jul 17 '17 at 16:28

Luca Borrione ,Nov 2, 2012 at 13:44

Sometimes it happened to me that the method described in the accepted answer didn't work, especially if the separator is a carriage return.
In those cases I solved it this way:
string='first line
second line
third line'

oldIFS="$IFS"
IFS='
'
IFS=${IFS:0:1} # this is useful to format your code with tabs
lines=( $string )
IFS="$oldIFS"

for line in "${lines[@]}"
    do
        echo "--> $line"
done

Stefan van den Akker ,Feb 9, 2015 at 16:52

+1 This completely worked for me. I needed to put multiple strings, divided by a newline, into an array, and read -a arr <<< "$strings" did not work with IFS=$'\n' . – Stefan van den Akker Feb 9 '15 at 16:52

Stefan van den Akker ,Feb 10, 2015 at 13:49

Here is the answer to make the accepted answer work when the delimiter is a newline . – Stefan van den Akker Feb 10 '15 at 13:49

,Jul 24, 2015 at 21:24

The accepted answer works for values in one line.
If the variable has several lines:
string='first line
        second line
        third line'

We need a very different command to get all lines:

while read -r line; do lines+=("$line"); done <<<"$string"

Or the much simpler bash readarray :

readarray -t lines <<<"$string"

Printing all lines is very easy taking advantage of a printf feature:

printf ">[%s]\n" "${lines[@]}"

>[first line]
>[        second line]
>[        third line]

Mayhem ,Dec 31, 2015 at 3:13

While not every solution works for every situation, your mention of readarray... replaced my last two hours with 5 minutes... you got my vote – Mayhem Dec 31 '15 at 3:13

Derek 朕會功夫 ,Mar 23, 2018 at 19:14

readarray is the right answer. – Derek 朕會功夫 Mar 23 '18 at 19:14

ssanch ,Jun 3, 2016 at 15:24

This is similar to the approach by Jmoney38, but using sed:
string="1,2,3,4"
array=(`echo $string | sed 's/,/\n/g'`)
echo ${array[0]}

Prints 1

dawg ,Nov 26, 2017 at 19:59

The key to splitting your string into an array is the multi character delimiter of ", " . Any solution using IFS for multi character delimiters is inherently wrong since IFS is a set of those characters, not a string.

If you assign IFS=", " then the string will break on EITHER "," OR " " or any combination of them which is not an accurate representation of the two character delimiter of ", " .

You can use awk or sed to split the string, with process substitution:

#!/bin/bash

str="Paris, France, Europe"
array=()
while read -r -d $'\0' each; do   # use a NUL terminated field separator 
    array+=("$each")
done < <(printf "%s" "$str" | awk '{ gsub(/,[ ]+|$/,"\0"); print }')
declare -p array
# declare -a array=([0]="Paris" [1]="France" [2]="Europe") output

It is more efficient to use a regex directly in Bash:

#!/bin/bash

str="Paris, France, Europe"

array=()
while [[ $str =~ ([^,]+)(,[ ]+|$) ]]; do
    array+=("${BASH_REMATCH[1]}")   # capture the field
    i=${#BASH_REMATCH}              # length of field + delimiter
    str=${str:i}                    # advance the string by that length
done                                # the loop deletes $str, so make a copy if needed

declare -p array
# declare -a array=([0]="Paris" [1]="France" [2]="Europe") output...

With the second form, there is no subshell and it will be inherently faster.


Edit by bgoldst: Here are some benchmarks comparing my readarray solution to dawg's regex solution, and I also included the read solution for the heck of it (note: I slightly modified the regex solution for greater harmony with my solution) (also see my comments below the post):

## competitors
function c_readarray { readarray -td '' a < <(awk '{ gsub(/, /,"\0"); print; };' <<<"$1, "); unset 'a[-1]'; };
function c_read { a=(); local REPLY=''; while read -r -d ''; do a+=("$REPLY"); done < <(awk '{ gsub(/, /,"\0"); print; };' <<<"$1, "); };
function c_regex { a=(); local s="$1, "; while [[ $s =~ ([^,]+),\  ]]; do a+=("${BASH_REMATCH[1]}"); s=${s:${#BASH_REMATCH}}; done; };

## helper functions
function rep {
    local -i i=-1;
    for ((i = 0; i<$1; ++i)); do
        printf %s "$2";
    done;
}; ## end rep()

function testAll {
    local funcs=();
    local args=();
    local func='';
    local -i rc=-1;
    while [[ "$1" != ':' ]]; do
        func="$1";
        if [[ ! "$func" =~ ^[_a-zA-Z][_a-zA-Z0-9]*$ ]]; then
            echo "bad function name: $func" >&2;
            return 2;
        fi;
        funcs+=("$func");
        shift;
    done;
    shift;
    args=("$@");
    for func in "${funcs[@]}"; do
        echo -n "$func ";
        { time $func "${args[@]}" >/dev/null 2>&1; } 2>&1| tr '\n' '/';
        rc=${PIPESTATUS[0]}; if [[ $rc -ne 0 ]]; then echo "[$rc]"; else echo; fi;
    done| column -ts/;
}; ## end testAll()

function makeStringToSplit {
    local -i n=$1; ## number of fields
    if [[ $n -lt 0 ]]; then echo "bad field count: $n" >&2; return 2; fi;
    if [[ $n -eq 0 ]]; then
        echo;
    elif [[ $n -eq 1 ]]; then
        echo 'first field';
    elif [[ "$n" -eq 2 ]]; then
        echo 'first field, last field';
    else
        echo "first field, $(rep $[$1-2] 'mid field, ')last field";
    fi;
}; ## end makeStringToSplit()

function testAll_splitIntoArray {
    local -i n=$1; ## number of fields in input string
    local s='';
    echo "===== $n field$(if [[ $n -ne 1 ]]; then echo 's'; fi;) =====";
    s="$(makeStringToSplit "$n")";
    testAll c_readarray c_read c_regex : "$s";
}; ## end testAll_splitIntoArray()

## results
testAll_splitIntoArray 1;
## ===== 1 field =====
## c_readarray   real  0m0.067s   user 0m0.000s   sys  0m0.000s
## c_read        real  0m0.064s   user 0m0.000s   sys  0m0.000s
## c_regex       real  0m0.000s   user 0m0.000s   sys  0m0.000s
##
testAll_splitIntoArray 10;
## ===== 10 fields =====
## c_readarray   real  0m0.067s   user 0m0.000s   sys  0m0.000s
## c_read        real  0m0.064s   user 0m0.000s   sys  0m0.000s
## c_regex       real  0m0.001s   user 0m0.000s   sys  0m0.000s
##
testAll_splitIntoArray 100;
## ===== 100 fields =====
## c_readarray   real  0m0.069s   user 0m0.000s   sys  0m0.062s
## c_read        real  0m0.065s   user 0m0.000s   sys  0m0.046s
## c_regex       real  0m0.005s   user 0m0.000s   sys  0m0.000s
##
testAll_splitIntoArray 1000;
## ===== 1000 fields =====
## c_readarray   real  0m0.084s   user 0m0.031s   sys  0m0.077s
## c_read        real  0m0.092s   user 0m0.031s   sys  0m0.046s
## c_regex       real  0m0.125s   user 0m0.125s   sys  0m0.000s
##
testAll_splitIntoArray 10000;
## ===== 10000 fields =====
## c_readarray   real  0m0.209s   user 0m0.093s   sys  0m0.108s
## c_read        real  0m0.333s   user 0m0.234s   sys  0m0.109s
## c_regex       real  0m9.095s   user 0m9.078s   sys  0m0.000s
##
testAll_splitIntoArray 100000;
## ===== 100000 fields =====
## c_readarray   real  0m1.460s   user 0m0.326s   sys  0m1.124s
## c_read        real  0m2.780s   user 0m1.686s   sys  0m1.092s
## c_regex       real  17m38.208s   user 15m16.359s   sys  2m19.375s
##

bgoldst ,Nov 27, 2017 at 4:28

Very cool solution! I never thought of using a loop on a regex match, nifty use of $BASH_REMATCH . It works, and does indeed avoid spawning subshells. +1 from me. However, by way of criticism, the regex itself is a little non-ideal, in that it appears you were forced to duplicate part of the delimiter token (specifically the comma) so as to work around the lack of support for non-greedy multipliers (also lookarounds) in ERE ("extended" regex flavor built into bash). This makes it a little less generic and robust. – bgoldst Nov 27 '17 at 4:28

bgoldst ,Nov 27, 2017 at 4:28

Secondly, I did some benchmarking, and although the performance is better than the other solutions for smallish strings, it worsens exponentially due to the repeated string-rebuilding, becoming catastrophic for very large strings. See my edit to your answer. – bgoldst Nov 27 '17 at 4:28

dawg ,Nov 27, 2017 at 4:46

@bgoldst: What a cool benchmark! In defense of the regex, for 10's or 100's of thousands of fields (what the regex is splitting) there would probably be some form of record (like \n delimited text lines) comprising those fields so the catastrophic slow-down would likely not occur. If you have a string with 100,000 fields -- maybe Bash is not ideal ;-) Thanks for the benchmark. I learned a thing or two. – dawg Nov 27 '17 at 4:46

Geoff Lee ,Mar 4, 2016 at 6:02

Try this
IFS=', '; array=(Paris, France, Europe)
for item in ${array[@]}; do echo $item; done

It's simple. If you want, you can also add a declare (and also remove the commas):

IFS=' ';declare -a array=(Paris France Europe)

The IFS=' ' is added to undo the IFS=', ' set above, but it works without it in a fresh bash instance

MrPotatoHead ,Nov 13, 2018 at 13:19

Pure bash multi-character delimiter solution.

As others have pointed out in this thread, the OP's question gave an example of a comma delimited string to be parsed into an array, but did not indicate if he/she was only interested in comma delimiters, single character delimiters, or multi-character delimiters.

Since Google tends to rank this answer at or near the top of search results, I wanted to provide readers with a strong answer to the question of multiple character delimiters, since that is also mentioned in at least one response.

If you're in search of a solution to a multi-character delimiter problem, I suggest reviewing Mallikarjun M's post, in particular the response from gniourf_gniourf who provides this elegant pure BASH solution using parameter expansion:

#!/bin/bash
str="LearnABCtoABCSplitABCaABCString"
delimiter=ABC
s=$str$delimiter
array=();
while [[ $s ]]; do
    array+=( "${s%%"$delimiter"*}" );
    s=${s#*"$delimiter"};
done;
declare -p array
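With the sample string above, that loop should print roughly:

declare -a array=([0]="Learn" [1]="to" [2]="Split" [3]="a" [4]="String")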

Link to cited comment/referenced post

Link to cited question: Howto split a string on a multi-character delimiter in bash?

Eduardo Cuomo ,Dec 19, 2016 at 15:27

Use this:
countries='Paris, France, Europe'
OIFS="$IFS"
IFS=', ' array=($countries)
IFS="$OIFS"

#${array[1]} == Paris
#${array[2]} == France
#${array[3]} == Europe

gniourf_gniourf ,Dec 19, 2016 at 17:22

Bad: subject to word splitting and pathname expansion. Please don't revive old questions with good answers to give bad answers. – gniourf_gniourf Dec 19 '16 at 17:22

Scott Weldon ,Dec 19, 2016 at 18:12

This may be a bad answer, but it is still a valid answer. Flaggers / reviewers: For incorrect answers such as this one, downvote, don't delete! – Scott Weldon Dec 19 '16 at 18:12

George Sovetov ,Dec 26, 2016 at 17:31

@gniourf_gniourf Could you please explain why it is a bad answer? I really don't understand when it fails. – George Sovetov Dec 26 '16 at 17:31

gniourf_gniourf ,Dec 26, 2016 at 18:07

@GeorgeSovetov: As I said, it's subject to word splitting and pathname expansion. More generally, splitting a string into an array as array=( $string ) is a (sadly very common) antipattern: word splitting occurs: string='Prague, Czech Republic, Europe' ; Pathname expansion occurs: string='foo[abcd],bar[efgh]' will fail if you have a file named, e.g., food or barf in your directory. The only valid usage of such a construct is when string is a glob. – gniourf_gniourf Dec 26 '16 at 18:07

user1009908 ,Jun 9, 2015 at 23:28

UPDATE: Don't do this, due to problems with eval.

With slightly less ceremony:

IFS=', ' eval 'array=($string)'

e.g.

string="foo, bar,baz"
IFS=', ' eval 'array=($string)'
echo ${array[1]} # -> bar

caesarsol ,Oct 29, 2015 at 14:42

eval is evil! don't do this. – caesarsol Oct 29 '15 at 14:42

user1009908 ,Oct 30, 2015 at 4:05

Pfft. No. If you're writing scripts large enough for this to matter, you're doing it wrong. In application code, eval is evil. In shell scripting, it's common, necessary, and inconsequential. – user1009908 Oct 30 '15 at 4:05

caesarsol ,Nov 2, 2015 at 18:19

put a $ in your variable and you'll see... I write many scripts and I never ever had to use a single eval – caesarsol Nov 2 '15 at 18:19

Dennis Williamson ,Dec 2, 2015 at 17:00

Eval command and security issues – Dennis Williamson Dec 2 '15 at 17:00

user1009908 ,Dec 22, 2015 at 23:04

You're right, this is only usable when the input is known to be clean. Not a robust solution. – user1009908 Dec 22 '15 at 23:04

Eduardo Lucio ,Jan 31, 2018 at 20:45

Here's my hack!

Splitting strings by strings is a pretty tedious thing to do in bash. What happens is that we have limited approaches that only work in a few cases (split by ";", "/", "." and so on), or we have a variety of side effects in the outputs.

The approach below has required a number of maneuvers, but I believe it will work for most of our needs!

#!/bin/bash

# --------------------------------------
# SPLIT FUNCTION
# ----------------

F_SPLIT_R=()
f_split() {
    : 'It does a "split" into a given string and returns an array.

    Args:
        TARGET_P (str): Target string to "split".
        DELIMITER_P (Optional[str]): Delimiter used to "split". If not 
    informed the split will be done by spaces.

    Returns:
        F_SPLIT_R (array): Array with the provided string separated by the 
    informed delimiter.
    '

    F_SPLIT_R=()
    TARGET_P=$1
    DELIMITER_P=$2
    if [ -z "$DELIMITER_P" ] ; then
        DELIMITER_P=" "
    fi

    REMOVE_N=1
    if [ "$DELIMITER_P" == "\n" ] ; then
        REMOVE_N=0
    fi

    # NOTE: This was the only parameter that has been a problem so far! 
    # By Questor
    # [Ref.: https://unix.stackexchange.com/a/390732/61742]
    if [ "$DELIMITER_P" == "./" ] ; then
        DELIMITER_P="[.]/"
    fi

    if [ ${REMOVE_N} -eq 1 ] ; then

        # NOTE: Due to bash limitations we have some problems getting the
        # output of a split by awk into an array, so we need to use a
        # "line break" (\n) to succeed. Given this, we temporarily remove
        # the line breaks and reintegrate them afterwards. The problem is
        # that if there is a line break in the provided "string", that line
        # break will be lost, that is, it is erroneously removed in the
        # output!
        # By Questor
        TARGET_P=$(awk 'BEGIN {RS="dn"} {gsub("\n", "3F2C417D448C46918289218B7337FCAF"); printf $0}' <<< "${TARGET_P}")

    fi

    # NOTE: The replace of "\n" by "3F2C417D448C46918289218B7337FCAF" results 
    # in more occurrences of "3F2C417D448C46918289218B7337FCAF" than the 
    # amount of "\n" that there was originally in the string (one more 
    # occurrence at the end of the string)! We can not explain the reason for 
    # this side effect. The line below corrects this problem! By Questor
    TARGET_P=${TARGET_P%????????????????????????????????}

    SPLIT_NOW=$(awk -F"$DELIMITER_P" '{for(i=1; i<=NF; i++){printf "%s\n", $i}}' <<< "${TARGET_P}")

    while IFS= read -r LINE_NOW ; do
        if [ ${REMOVE_N} -eq 1 ] ; then

            # NOTE: We use "'" to prevent blank lines with no other characters 
            # in the sequence being erroneously removed! We do not know the 
            # reason for this side effect! By Questor
            LN_NOW_WITH_N=$(awk 'BEGIN {RS="dn"} {gsub("3F2C417D448C46918289218B7337FCAF", "\n"); printf $0}' <<< "'${LINE_NOW}'")

            # NOTE: We use the commands below to revert the intervention made 
            # immediately above! By Questor
            LN_NOW_WITH_N=${LN_NOW_WITH_N%?}
            LN_NOW_WITH_N=${LN_NOW_WITH_N#?}

            F_SPLIT_R+=("$LN_NOW_WITH_N")
        else
            F_SPLIT_R+=("$LINE_NOW")
        fi
    done <<< "$SPLIT_NOW"
}

# --------------------------------------
# HOW TO USE
# ----------------

STRING_TO_SPLIT="
 * How do I list all databases and tables using psql?

\"
sudo -u postgres /usr/pgsql-9.4/bin/psql -c \"\l\"
sudo -u postgres /usr/pgsql-9.4/bin/psql <DB_NAME> -c \"\dt\"
\"

\"
\list or \l: list all databases
\dt: list all tables in the current database
\"

[Ref.: https://dba.stackexchange.com/questions/1285/how-do-i-list-all-databases-and-tables-using-psql]


"

f_split "$STRING_TO_SPLIT" "bin/psql -c"

# --------------------------------------
# OUTPUT AND TEST
# ----------------

ARR_LENGTH=${#F_SPLIT_R[*]}
for (( i=0; i<=$(( $ARR_LENGTH -1 )); i++ )) ; do
    echo " > -----------------------------------------"
    echo "${F_SPLIT_R[$i]}"
    echo " < -----------------------------------------"
done

if [ "$STRING_TO_SPLIT" == "${F_SPLIT_R[0]}bin/psql -c${F_SPLIT_R[1]}" ] ; then
    echo " > -----------------------------------------"
    echo "The strings are the same!"
    echo " < -----------------------------------------"
fi

sel-en-ium ,May 31, 2018 at 5:56

Another way to do it without modifying IFS:
read -r -a myarray <<< "${string//, /$IFS}"

Rather than changing IFS to match our desired delimiter, we can replace all occurrences of our desired delimiter ", " with the contents of $IFS via "${string//, /$IFS}".

Maybe this will be slow for very large strings though?

This is based on Dennis Williamson's answer.

rsjethani ,Sep 13, 2016 at 16:21

Another approach can be:
str="a, b, c, d"  # assuming there is a space after ',' as in Q
arr=(${str//,/})  # delete all occurrences of ','

After this, 'arr' is an array with four strings. This doesn't require dealing with IFS or read or any other special stuff, hence it is much simpler and more direct.

gniourf_gniourf ,Dec 26, 2016 at 18:12

Same (sadly common) antipattern as other answers: subject to word splitting and filename expansion. – gniourf_gniourf Dec 26 '16 at 18:12

Safter Arslan ,Aug 9, 2017 at 3:21

Another way would be:
string="Paris, France, Europe"
IFS=', ' arr=(${string})

Now your elements are stored in "arr" array. To iterate through the elements:

for i in ${arr[@]}; do echo $i; done

bgoldst ,Aug 13, 2017 at 22:38

I cover this idea in my answer ; see Wrong answer #5 (you might be especially interested in my discussion of the eval trick). Your solution leaves $IFS set to the comma-space value after-the-fact. – bgoldst Aug 13 '17 at 22:38
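
One way to get the same split without leaving IFS modified afterwards is to scope the assignment to a single read command; a minimal sketch (names are placeholders):

string="Paris, France, Europe"
IFS=', ' read -r -a arr <<< "$string"   # the IFS change applies to this read only
printf '%s\n' "${arr[@]}"               # Paris / France / Europe; $IFS is unchanged afterwards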

[Jan 28, 2019] regex - Safe rm -rf function in shell script

Jan 28, 2019 | stackoverflow.com

community wiki
5 revs
,May 23, 2017 at 12:26

This question is similar to What is the safest way to empty a directory in *nix?

I'm writing a bash script which defines several path constants and will use them for file and directory manipulation (copying, renaming and deleting). Often it will be necessary to do something like:

rm -rf "/${PATH1}"
rm -rf "${PATH2}/"*

While developing this script I want to protect myself from mistyping names like PATH1 and PATH2 and avoid situations where they expand to an empty string, which would result in wiping the whole disk. I decided to create a special wrapper:

rmrf() {
    if [[ $1 =~ "regex" ]]; then
        echo "Ignoring possibly unsafe path ${1}"
        exit 1
    fi

    shopt -s dotglob
    rm -rf -- $1
    shopt -u dotglob
}

Which will be called as:

rmrf "/${PATH1}"
rmrf "${PATH2}/"*

The regex (or sed expression) should catch paths like "*", "/*", "/**/", "///*" etc. but allow paths like "dir", "/dir", "/dir1/dir2/", "/dir1/dir2/*". Also I don't know how to enable shell globbing in a case like "/dir with space/*". Any ideas?

EDIT: this is what I came up with so far:

rmrf() {
    local RES
    local RMPATH="${1}"
    SAFE=$(echo "${RMPATH}" | sed -r 's:^((\.?\*+/+)+.*|(/+\.?\*+)+.*|[\.\*/]+|.*/\.\*+)$::g')
    if [ -z "${SAFE}" ]; then
        echo "ERROR! Unsafe deletion of ${RMPATH}"
        return 1
    fi

    shopt -s dotglob
    if [ '*' == "${RMPATH: -1}" ]; then
        echo rm -rf -- "${RMPATH/%\*/}"*
        RES=$?
    else
        echo rm -rf -- "${RMPATH}"
        RES=$?
    fi
    shopt -u dotglob

    return $RES
}

Intended use is (note an asterisk inside quotes):

rmrf "${SOMEPATH}"
rmrf "${SOMEPATH}/*"

where $SOMEPATH is not a system or /home directory (in my case all such operations are performed on a filesystem mounted under the /scratch directory).

CAVEATS:

SpliFF ,Jun 14, 2009 at 13:45

A big danger I've found with rm in bash is that bash usually doesn't stop on errors. That means that:
cd $SOMEPATH
rm -rf *

Is a very dangerous combination if the change directory fails. A safer way would be:

cd $SOMEPATH && rm -rf *

which ensures the rm -rf won't run unless you really are in $SOMEPATH. This doesn't protect you from a bad $SOMEPATH, but it can be combined with the advice given by others to help make your script safer.

EDIT: @placeybordeaux makes a good point that if $SOMEPATH is undefined or empty, cd doesn't treat it as an error and returns 0. In light of that, this answer should be considered unsafe unless $SOMEPATH is first validated as existing and non-empty. I believe cd with no args should be an illegal command, since at best it performs a no-op and at worst it can lead to unexpected behaviour, but it is what it is.

Sazzad Hissain Khan ,Jul 6, 2017 at 11:45

nice trick, I am one stupid victim. – Sazzad Hissain Khan Jul 6 '17 at 11:45

placeybordeaux ,Jun 21, 2018 at 22:59

If $SOMEPATH is empty won't this rm -rf the user's home directory? – placeybordeaux Jun 21 '18 at 22:59

SpliFF ,Jun 27, 2018 at 4:10

@placeybordeaux The && only runs the second command if the first succeeds - so if cd fails rm never runs – SpliFF Jun 27 '18 at 4:10

placeybordeaux ,Jul 3, 2018 at 18:46

@SpliFF at least in ZSH the return value of cd $NONEXISTANTVAR is 0placeybordeaux Jul 3 '18 at 18:46

ruakh ,Jul 13, 2018 at 6:46

Instead of cd $SOMEPATH , you should write cd "${SOMEPATH?}" . The ${varname?} notation ensures that the expansion fails with a warning-message if the variable is unset or empty (such that the && ... part is never run); the double-quotes ensure that special characters in $SOMEPATH , such as whitespace, don't have undesired effects. – ruakh Jul 13 '18 at 6:46
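
Putting that comment together with the cd && rm pattern above gives a guarded one-liner; a sketch, with SOMEPATH as a placeholder (the :? form also rejects an empty value, not only an unset one):

cd "${SOMEPATH:?SOMEPATH is unset or empty}" && rm -rf -- *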

community wiki
2 revs
,Jul 24, 2009 at 22:36

There is a set -u bash option that will cause the shell to exit when an uninitialized variable is used. I read about it here, with rm -rf as an example. I think that's what you're looking for. And here is set's manual.
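
A minimal sketch of the behaviour being described (the variable name is made up):

#!/bin/bash
set -u
echo "about to clean ${BACKUP_DIR}/old"   # the script aborts here if BACKUP_DIR was never set
rm -rf -- "${BACKUP_DIR}/old"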

,Jun 14, 2009 at 12:38

I think "rm" command has a parameter to avoid the deleting of "/". Check it out.

Max ,Jun 14, 2009 at 12:56

Thanks! I didn't know about such an option. Actually it is named --preserve-root and is not mentioned in the manpage. – Max Jun 14 '09 at 12:56

Max ,Jun 14, 2009 at 13:18

On my system this option is on by default, but it can't help in a case like rm -ri /* – Max Jun 14 '09 at 13:18

ynimous ,Jun 14, 2009 at 12:42

I would recommend using realpath(1) on the argument rather than the argument directly, so that you can avoid things like /A/B/../ or symbolic links.

Max ,Jun 14, 2009 at 13:30

Useful but non-standard command. I've found a possible bash replacement: archlinux.org/pipermail/pacman-dev/2009-February/008130.html – Max Jun 14 '09 at 13:30

Jonathan Leffler ,Jun 14, 2009 at 12:47

Generally, when I'm developing a command with operations such as ' rm -fr ' in it, I will neutralize the remove during development. One way of doing that is:
RMRF="echo rm -rf"
...
$RMRF "/${PATH1}"

This shows me what should be deleted - but does not delete it. I will do a manual clean up while things are under development - it is a small price to pay for not running the risk of screwing up everything.

The notation ' "/${PATH1}" ' is a little unusual; normally, you would ensure that PATH1 simply contains an absolute pathname.

Using the metacharacter with ' "${PATH2}/"* ' is unwise and unnecessary. The only difference between using that and using just ' "${PATH2}" ' is that if the directory specified by PATH2 contains any files or directories with names starting with dot, then those files or directories will not be removed. Such a design is unlikely and is rather fragile. It would be much simpler just to pass PATH2 and let the recursive remove do its job. Adding the trailing slash is not necessarily a bad idea; the system would have to ensure that $PATH2 contains a directory name, not just a file name, but the extra protection is rather minimal.

Using globbing with ' rm -fr ' is usually a bad idea. You want to be precise and restrictive and limiting in what it does - to prevent accidents. Of course, you'd never run the command (shell script you are developing) as root while it is under development - that would be suicidal. Or, if root privileges are absolutely necessary, you neutralize the remove operation until you are confident it is bullet-proof.
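
One hedged way to wrap the same idea in a switch, so the echo prefix can be dropped once the script is trusted (DRY_RUN and the paths are placeholders):

DRY_RUN=1                     # set to 0 once you trust the script
RMRF="rm -rf"
[ "${DRY_RUN:-0}" -eq 1 ] && RMRF="echo rm -rf"
$RMRF "/${PATH1}"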

Max ,Jun 14, 2009 at 13:09

To delete subdirectories and files starting with a dot I use "shopt -s dotglob". Using rm -rf "${PATH2}" is not appropriate because in my case PATH2 itself can only be removed by the superuser, and that results in an error status for the "rm" command (which I check in order to track other errors). – Max Jun 14 '09 at 13:09

Jonathan Leffler ,Jun 14, 2009 at 13:37

Then, with due respect, you should use a private sub-directory under $PATH2 that you can remove. Avoid glob expansion with commands like 'rm -rf' like you would avoid the plague (or should that be A/H1N1?). – Jonathan Leffler Jun 14 '09 at 13:37

Max ,Jun 14, 2009 at 14:10

Meanwhile I've found this perl project: http://code.google.com/p/safe-rm/

community wiki
too much php
,Jun 15, 2009 at 1:55

If it is possible, you should try and put everything into a folder with a hard-coded name which is unlikely to be found anywhere else on the filesystem, such as ' foofolder '. Then you can write your rmrf() function as:
rmrf() {
    rm -rf "foofolder/$PATH1"
    # or
    rm -rf "$PATH1/foofolder"
}

There is no way that function can delete anything but the files you want it to.

vadipp ,Jan 13, 2017 at 11:37

Actually there is a way: if PATH1 is something like ../../someotherdirvadipp Jan 13 '17 at 11:37
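
A hedged way to close that hole is to canonicalize the path first and verify it still lives under the expected base directory; realpath was already suggested above, and /scratch/foofolder here is only an example base:

rmrf() {
    local base="/scratch/foofolder"
    local target
    target=$(realpath -m -- "$base/$1") || return 1   # resolves .. and symlinks (GNU realpath)
    case "$target" in
        "$base"/*) rm -rf -- "$target" ;;
        *) echo "Refusing to delete ${target}: outside ${base}" >&2; return 1 ;;
    esac
}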

community wiki
btop
,Jun 15, 2009 at 6:34

You may use
set -f    # cf. help set

to disable filename generation (*).
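
A tiny sketch of what that changes:

set -f          # filename generation off: the pattern stays literal
echo /etc/*     # prints the literal string /etc/*
set +f          # back on
echo /etc/*     # expands to the actual file names again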

community wiki
Howard Hong
,Oct 28, 2009 at 19:56

You don't need to use regular expressions.
Just assign the directories you want to protect to a variable and then iterate over the variable. eg:
protected_dirs="/ /bin /usr/bin /home $HOME"
for d in $protected_dirs; do
    if [ "$1" = "$d" ]; then
        rm=0
        break;
    fi
done
if [ ${rm:-1} -eq 1 ]; then
    rm -rf $1
fi

,

Add the following codes to your ~/.bashrc
# safe delete
move_to_trash () { now="$(date +%Y%m%d_%H%M%S)"; mv "$@" ~/.local/share/Trash/files/"$@_$now"; }
alias del='move_to_trash'

# safe rm
alias rmi='rm -i'

Every time you need to rm something, first consider del; you can change the trash folder. If you really do need to remove something permanently, you can go to the trash folder and use rmi.

One small bug with del is that when deleting a folder, for example my_folder, you should call del my_folder and not del my_folder/, since I append the time information to the end ( "$@_$now" ) to allow a possible later restore. For files it works fine.
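
A hedged revision of the same helper that tolerates a trailing slash and handles several arguments (the trash location is as in the original; treat it as a sketch, not a drop-in):

move_to_trash () {
    local now f
    now="$(date +%Y%m%d_%H%M%S)"
    for f in "$@"; do
        f="${f%/}"    # strip a trailing slash so folders work too
        mv -- "$f" ~/.local/share/Trash/files/"$(basename -- "$f")_$now"
    done
}
alias del='move_to_trash'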

[Jan 17, 2019] How do I launch the default web browser in Perl on any operating system

Jan 17, 2019 | stackoverflow.com

The second hit on "open url" at search.cpan brings up Browser::Open:

use Browser::Open qw( open_browser );

my $url = 'http://www.google.com/';
open_browser($url);

If your OS isn't supported, send a patch or a bug report.

--cjm

More at Stack Overflow

[Jan 10, 2019] linux - How does cat EOF work in bash - Stack Overflow

Notable quotes:
"... The $sql variable now holds the new-line characters too. You can verify with echo -e "$sql" . ..."
"... The print.sh file now contains: ..."
"... The b.txt file contains bar and baz lines. The same output is printed to stdout . ..."
Jan 10, 2019 | stackoverflow.com

How does "cat << EOF" work in bash? Ask Question 454


hasen ,Mar 23, 2010 at 13:57

I needed to write a script to enter multi-line input to a program ( psql ).

After a bit of googling, I found the following syntax works:

cat << EOF | psql ---params
BEGIN;

`pg_dump ----something`

update table .... statement ...;

END;
EOF

This correctly constructs the multi-line string (from BEGIN; to END; , inclusive) and pipes it as an input to psql .

But I have no idea how/why it works, can some one please explain?

I'm referring mainly to cat << EOF , I know > outputs to a file, >> appends to a file, < reads input from file.

What does << exactly do?

And is there a man page for it?

Dennis Williamson ,Mar 23, 2010 at 18:28

That's probably a useless use of cat . Try psql ... << EOF ... See also "here strings". mywiki.wooledge.org/BashGuide/InputAndOutput?#Here_StringsDennis Williamson Mar 23 '10 at 18:28
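
What that comment is pointing at is feeding the here-document to psql directly, without cat; a rough sketch (the connection options, database and table names are made up):

psql -h localhost -d mydb <<EOF
BEGIN;
UPDATE my_table SET my_col = 1 WHERE id = 42;
END;
EOF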

hasen ,Mar 23, 2010 at 18:54

@Dennis: good point, and thanks for the link! – hasen Mar 23 '10 at 18:54

Alex ,Mar 23, 2015 at 23:31

I'm surprised it works with cat but not with echo. cat should expect a file name as stdin, not a char string. psql << EOF sounds logical, but not otherwise. Works with cat but not with echo. Strange behaviour. Any clue about that? – Alex Mar 23 '15 at 23:31

Alex ,Mar 23, 2015 at 23:39

Answering myself: cat without parameters executes and replicates to its output whatever is sent via its input (stdin), hence using its output to fill the file via >. A file name read as a parameter, in contrast, is not a stdin stream. – Alex Mar 23 '15 at 23:39

The-null-Pointer- ,Jan 1, 2018 at 18:03

@Alex echo just prints its command-line arguments, while cat reads stdin (when piped to it) or reads the file(s) named by its command-line args – The-null-Pointer- Jan 1 '18 at 18:03

kennytm ,Mar 23, 2010 at 13:58

This is called the heredoc format; it provides a string on stdin. See https://en.wikipedia.org/wiki/Here_document#Unix_shells for more details.

From man bash :

Here Documents

This type of redirection instructs the shell to read input from the current source until a line containing only word (with no trailing blanks) is seen.

All of the lines read up to that point are then used as the standard input for a command.

The format of here-documents is:

          <<[-]word
                  here-document
          delimiter

No parameter expansion, command substitution, arithmetic expansion, or pathname expansion is performed on word . If any characters in word are quoted, the delimiter is the result of quote removal on word , and the lines in the here-document are not expanded. If word is unquoted, all lines of the here-document are subjected to parameter expansion, command substitution, and arithmetic expansion. In the latter case, the character sequence \<newline> is ignored, and \ must be used to quote the characters \ , $ , and ` .

If the redirection operator is <<- , then all leading tab characters are stripped from input lines and the line containing delimiter . This allows here-documents within shell scripts to be indented in a natural fashion.

Xeoncross ,May 26, 2011 at 22:51

I was having the hardest time disabling variable/parameter expansion. All I needed to do was use "double-quotes" and that fixed it! Thanks for the info! – Xeoncross May 26 '11 at 22:51

trkoch ,Nov 10, 2015 at 17:23

Concerning <<- please note that only leading tab characters are stripped -- not soft tabs (spaces). This is one of those rare cases when you actually need the tab character. If the rest of your document uses soft tabs, make sure to show invisible characters and (e.g.) copy and paste a tab character. If you do it right, your syntax highlighting should correctly catch the ending delimiter. – trkoch Nov 10 '15 at 17:23

BrDaHa ,Jul 13, 2017 at 19:01

I don't see how this answer is more helpful than the ones below. It merely regurgitates information that can be found in other places (that have likely already been checked) – BrDaHa Jul 13 '17 at 19:01

Vojtech Vitek ,Feb 4, 2014 at 10:28

The cat <<EOF syntax is very useful when working with multi-line text in Bash, e.g. when assigning a multi-line string to a shell variable, file or pipe. Examples of cat <<EOF syntax usage in Bash: 1. Assign a multi-line string to a shell variable
$ sql=$(cat <<EOF
SELECT foo, bar FROM db
WHERE foo='baz'
EOF
)

The $sql variable now holds the new-line characters too. You can verify with echo -e "$sql" .

2. Pass multi-line string to a file in Bash
$ cat <<EOF > print.sh
#!/bin/bash
echo \$PWD
echo $PWD
EOF

The print.sh file now contains:

#!/bin/bash
echo $PWD
echo /home/user
3. Pass multi-line string to a pipe in Bash
$ cat <<EOF | grep 'b' | tee b.txt
foo
bar
baz
EOF

The b.txt file contains bar and baz lines. The same output is printed to stdout .

edelans ,Aug 22, 2014 at 8:48

In your case, "EOF" is known as a "Here Tag". Basically <<Here tells the shell that you are going to enter a multiline string until the "tag" Here . You can name this tag as you want, it's often EOF or STOP .

Some rules about the Here tags:

  1. The tag can be any string, uppercase or lowercase, though most people use uppercase by convention.
  2. The tag will not be considered as a Here tag if there are other words in that line. In this case, it will merely be considered part of the string. The tag should be by itself on a separate line, to be considered a tag.
  3. The tag should have no leading or trailing spaces in that line to be considered a tag. Otherwise it will be considered as part of the string.

example:

$ cat >> test <<HERE
> Hello world HERE <-- Not by itself on a separate line -> not considered end of string
> This is a test
>  HERE <-- Leading space, so not considered end of string
> and a new line
> HERE <-- Now we have the end of the string

oemb1905 ,Feb 22, 2017 at 7:17

this is the best actual answer ... you define both and clearly state the primary purpose of the use instead of related theory ... which is important but not necessary ... thanks - super helpful – oemb1905 Feb 22 '17 at 7:17

The-null-Pointer- ,Jan 1, 2018 at 18:05

@edelans you must add that when <<- is used leading tab will not prevent the tag from being recognized – The-null-Pointer- Jan 1 '18 at 18:05

JawSaw ,Oct 28, 2018 at 13:44

your answer clicked me on "you are going to enter a multiline string" – JawSaw Oct 28 '18 at 13:44

Ciro Santilli 新疆改造中心 六四事件 法轮功 ,Jun 9, 2015 at 9:41

POSIX 7

kennytm quoted man bash , but most of that is also POSIX 7: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_04 :

The redirection operators "<<" and "<<-" both allow redirection of lines contained in a shell input file, known as a "here-document", to the input of a command.

The here-document shall be treated as a single word that begins after the next <newline> and continues until there is a line containing only the delimiter and a <newline>, with no characters in between. Then the next here-document starts, if there is one. The format is as follows:

[n]<<word
    here-document
delimiter

where the optional n represents the file descriptor number. If the number is omitted, the here-document refers to standard input (file descriptor 0).

If any character in word is quoted, the delimiter shall be formed by performing quote removal on word, and the here-document lines shall not be expanded. Otherwise, the delimiter shall be the word itself.

If no characters in word are quoted, all lines of the here-document shall be expanded for parameter expansion, command substitution, and arithmetic expansion. In this case, the <backslash> in the input behaves as the <backslash> inside double-quotes (see Double-Quotes). However, the double-quote character ( '"' ) shall not be treated specially within a here-document, except when the double-quote appears within "$()", "``", or "${}".

If the redirection symbol is "<<-", all leading <tab> characters shall be stripped from input lines and the line containing the trailing delimiter. If more than one "<<" or "<<-" operator is specified on a line, the here-document associated with the first operator shall be supplied first by the application and shall be read first by the shell.

When a here-document is read from a terminal device and the shell is interactive, it shall write the contents of the variable PS2, processed as described in Shell Variables, to standard error before reading each line of input until the delimiter has been recognized.

Examples

Some examples not yet given.

Quotes prevent parameter expansion

Without quotes:

a=0
cat <<EOF
$a
EOF

Output:

0

With quotes:

a=0
cat <<'EOF'
$a
EOF

or (ugly but valid):

a=0
cat <<E"O"F
$a
EOF

Outputs:

$a
Hyphen removes leading tabs

Without hyphen:

cat <<EOF
<tab>a
EOF

where <tab> is a literal tab, and can be inserted with Ctrl + V <tab>

Output:

<tab>a

With hyphen:

cat <<-EOF
<tab>a
<tab>EOF

Output:

a

This exists of course so that you can indent your cat like the surrounding code, which is easier to read and maintain. E.g.:

if true; then
    cat <<-EOF
    a
    EOF
fi

Unfortunately, this does not work for space characters: POSIX favored tab indentation here. Yikes.

David C. Rankin ,Aug 12, 2015 at 7:10

In your last example discussing <<- and <tab>a , it should be noted that the purpose was to allow normal indentation of code within the script while allowing heredoc text presented to the receiving process to begin in column 0. It is a not too commonly seen feature and a bit more context may prevent a good deal of head-scratching... – David C. Rankin Aug 12 '15 at 7:10

Ciro Santilli 新疆改造中心 六四事件 法轮功 ,Aug 12, 2015 at 8:22

@DavidC.Rankin updated to clarify that, thanks. – Ciro Santilli 新疆改造中心 六四事件 法轮功 Aug 12 '15 at 8:22

Jeanmichel Cote ,Sep 23, 2015 at 19:58

How should I escape expansion if some of the content between my EOF tags needs to be expanded and some doesn't? – Jeanmichel Cote Sep 23 '15 at 19:58

Jeanmichel Cote ,Sep 23, 2015 at 20:00

...just use the backslash in front of the $Jeanmichel Cote Sep 23 '15 at 20:00

Ciro Santilli 新疆改造中心 六四事件 法轮功 ,Sep 23, 2015 at 20:01

@JeanmichelCote I don't see a better option :-) With regular strings you can also consider mixing up quotes like "$a"'$b'"$c" , but there is no analogue here AFAIK. – Ciro Santilli 新疆改造中心 六四事件 法轮功 Sep 23 '15 at 20:01
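
A small sketch of the backslash approach mentioned just above: leave the delimiter unquoted so expansion stays on, and escape only the dollars you want kept literal (the variable name is made up):

name="world"
cat <<EOF
expanded:   hello $name
left alone: the variable is written \$name
EOF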

Andreas Maier ,Feb 13, 2017 at 12:14

Using tee instead of cat

Not exactly an answer to the original question, but I wanted to share this anyway: I needed to create a config file in a directory that required root rights.

The following does not work for that case:

$ sudo cat <<EOF >/etc/somedir/foo.conf
# my config file
foo=bar
EOF

because the redirection is handled outside of the sudo context.

I ended up using this instead:

$ sudo tee <<EOF /etc/somedir/foo.conf >/dev/null
# my config file
foo=bar
EOF

user9048395 ,Jun 6, 2018 at 0:15

This isn't necessarily an answer to the original question, but a sharing of some results from my own testing. This:
<<test > print.sh
#!/bin/bash
echo \$PWD
echo $PWD
test

will produce the same file as:

cat <<test > print.sh
#!/bin/bash
echo \$PWD
echo $PWD
test

So, I don't see the point of using the cat command.

> ,Dec 19, 2013 at 21:40

Worth noting that here docs work in bash loops too. This example shows how to get the column list of a table:
export postgres_db_name='my_db'
export table_name='my_table_name'

# start copy 
while read -r c; do test -z "$c" || echo $table_name.$c , ; done < <(cat << EOF | psql -t -q -d $postgres_db_name -v table_name="${table_name:-}"
SELECT column_name
FROM information_schema.columns
WHERE 1=1
AND table_schema = 'public'
AND table_name   =:'table_name'  ;
EOF
)
# stop copy , now paste straight into the bash shell ...

output: 
my_table_name.guid ,
my_table_name.id ,
my_table_name.level ,
my_table_name.seq ,

or even without the new line

while read -r c; do test -z "$c" || echo $table_name.$c , | perl -ne 
's/\n//gm;print' ; done < <(cat << EOF | psql -t -q -d $postgres_db_name -v table_name="${table_name:-}"
 SELECT column_name
 FROM information_schema.columns
 WHERE 1=1
 AND table_schema = 'public'
 AND table_name   =:'table_name'  ;
 EOF
 )

 # output: daily_issues.guid ,daily_issues.id ,daily_issues.level ,daily_issues.seq ,daily_issues.prio ,daily_issues.weight ,daily_issues.status ,daily_issues.category ,daily_issues.name ,daily_issues.description ,daily_issues.type ,daily_issues.owner

[Jan 03, 2019] Using Lua for working with excel - Stack Overflow

Jan 03, 2019 | stackoverflow.com

Using Lua for working with excel


Animesh ,Oct 14, 2009 at 12:04

I am planning to learn Lua for my desktop scripting needs. I want to know if there is any documentation available, and also whether the standard library has everything needed.

uroc ,Oct 14, 2009 at 12:09

You should check out Lua for Windows -- a 'batteries included environment' for the Lua scripting language on Windows

http://luaforwindows.luaforge.net/

It includes the LuaCOM library, from which you can access the Excel COM object.

Try looking at the LuaCOM documentation, there are some Excel examples in that:

http://www.tecgraf.puc-rio.br/~rcerq/luacom/pub/1.3/luacom-htmldoc/

I've only ever used this for very simplistic things. Here is a sample to get you started:

-- test.lua
require('luacom')
excel = luacom.CreateObject("Excel.Application")
excel.Visible = true
wb = excel.Workbooks:Add()
ws = wb.Worksheets(1)

for i=1, 20 do
    ws.Cells(i,1).Value2 = i
end

Animesh ,Oct 14, 2009 at 12:26

Thanks uroc for your quick response. If possible, please let me know of any beginner tutorial or at least some sample code for using COM programming via Lua. :) – Animesh Oct 14 '09 at 12:26

sagasw ,Oct 16, 2009 at 1:02

A more complex code example for Lua working with Excel:
require "luacom"

excel = luacom.CreateObject("Excel.Application")

local book  = excel.Workbooks:Add()
local sheet = book.Worksheets(1)

excel.Visible = true

for row=1, 30 do
  for col=1, 30 do
    sheet.Cells(row, col).Value2 = math.floor(math.random() * 100)
  end
end


local range = sheet:Range("A1")

for row=1, 30 do
  for col=1, 30 do
    local v = sheet.Cells(row, col).Value2

    if v > 50 then
        local cell = range:Offset(row-1, col-1)

        cell:Select()
        excel.Selection.Interior.Color = 65535
    end
  end
end

excel.DisplayAlerts = false
excel:Quit()
excel = nil

Another example, which adds a graph chart:

require "luacom"

excel = luacom.CreateObject("Excel.Application")

local book  = excel.Workbooks:Add()
local sheet = book.Worksheets(1)

excel.Visible = true

for row=1, 30 do
  sheet.Cells(row, 1).Value2 = math.floor(math.random() * 100)
end

local chart = excel.Charts:Add()
chart.ChartType = 4  --  xlLine

local range = sheet:Range("A1:A30")
chart:SetSourceData(range)

Incredulous Monk ,Oct 19, 2009 at 4:17

A quick suggestion: fragments of code will look better if you format them as code (use the little "101 010" button). – Incredulous Monk Oct 19 '09 at 4:17

[Jan 01, 2019] mc - How can I set the default (user defined) listing mode in Midnight Commander- - Unix Linux Stack Exchange

Jan 01, 2019 | unix.stackexchange.com


papaiatis ,Jul 14, 2016 at 11:51

I defined my own listing mode and I'd like to make it permanent so that on the next mc start my defined listing mode will be set. I found no configuration file for mc.

,

You probably have Auto save setup turned off in the Options->Configuration menu.

You can save the configuration manually by Options->Save setup .

Panels setup is saved to ~/.config/mc/panels.ini .

[Dec 05, 2018] How to make putty ssh connection never to timeout when user is idle?

Dec 05, 2018 | askubuntu.com

David MZ ,Feb 13, 2013 at 18:07

I have an Ubuntu 12.04 server I bought. If I connect with PuTTY using ssh and a sudoer user, PuTTY gets disconnected by the server after some time if I am idle. How do I configure Ubuntu to keep this connection alive indefinitely?

das Keks ,Feb 13, 2013 at 18:24

If you go to your putty settings -> Connection and set the value of "Seconds between keepalives" to 30 seconds this should solve your problem.

kokbira ,Feb 19 at 11:42

?????? "0 to turn off" or 30 to turn off????????? I think he must put 0 instead of 30! – kokbira Feb 19 at 11:42

das Keks ,Feb 19 at 11:46

No, it's the time between keepalives. If you set it to 0, no keepalives are sent but you want putty to send keepalives to keep the connection alive. – das Keks Feb 19 at 11:46

Aaron ,Mar 19 at 20:39

I did this but still it drops.. – Aaron Mar 19 at 20:39

0xC0000022L ,Feb 13, 2013 at 19:29

In addition to the answer from "das Keks" there is at least one other aspect that can affect this behavior. Bash (usually the default shell on Ubuntu) has a value TMOUT which governs (decimal value in seconds) after which time an idle shell session will time out and the user will be logged out, leading to a disconnect in an SSH session.
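
A quick way to check whether TMOUT is what is killing the session, and to neutralize it for the current shell (note that some servers deliberately set TMOUT read-only, in which case this will not work):

echo "${TMOUT:-unset}"   # a number here is the idle limit in seconds
unset TMOUT              # or: export TMOUT=0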

In addition I would strongly recommend that you do something else entirely. Set up byobu (or even just tmux alone as it's superior to GNU screen ) and always log in and attach to a preexisting session (that's GNU screen and tmux terminology). This way even if you get forcibly disconnected - let's face it, a power outage or network interruption can always happen - you can always resume your work where you left. And that works across different machines. So you can connect to the same session from another machine (e.g. from home). The possibilities are manifold and it's a true productivity booster. And not to forget, terminal multiplexers overcome one of the big disadvantages of PuTTY: no tabbed interface. Now you get "tabs" in the form of windows and panes inside GNU screen and tmux .

apt-get install tmux
apt-get install byobu

Byobu is a nice frontend to both terminal multiplexers, but tmux is so comfortable that in my opinion it obsoletes byobu to a large extent. So my recommendation would be tmux .

Also search for "dotfiles", in particular tmux.conf and .tmux.conf on the web for many good customizations to get you started.

Rajesh ,Mar 19, 2015 at 15:10

Go to PuTTy options --> Connection
  1. Change the default value for "Seconds between keepalives(0 to turn off)" : from 0 to 600 (10 minutes) --This varies...reduce if 10 minutes doesn't help
  2. Check the "Enable TCP_keepalives (SO_KEEPALIVE option)" check box.
  3. Finally save setting for session

,

I keep my PuTTY sessions alive by monitoring the cron logs
tail -f /var/log/cron

I want the PuTTY session kept alive because I'm proxying through SOCKS.

[Dec 05, 2018] How can I scroll up to see the past output in PuTTY?

Dec 05, 2018 | superuser.com


user1721949 ,Dec 12, 2012 at 8:32

I have a script which, when I run it from PuTTY, scrolls the screen. Now I want to go back to see the errors, but when I scroll up, I can see the past commands but not the output of the commands.

How can I see the past output?

Rico ,Dec 13, 2012 at 8:24

Shift+Pgup/PgDn should work for scrolling without using the scrollbar.

> ,Jul 12, 2017 at 21:45

If shift pageup/pagedown fails, try this command: "reset", which seems to correct the display. – user530079 Jul 12 '17 at 21:45

RedGrittyBrick ,Dec 12, 2012 at 9:31

If you don't pipe the output of your commands into something like less , you will be able to use Putty's scroll-bars to view earlier output.

Putty has settings for how many lines of past output it retains in its buffer.


[screenshots: before scrolling; after scrolling back (upwards)]

If you use something like less, the output doesn't get into Putty's scroll buffer.

[screenshot: after using less]

David Dai ,Dec 14, 2012 at 3:31

why is putty different from the native linux console on this point? – David Dai Dec 14 '12 at 3:31

konradstrack ,Dec 12, 2012 at 9:52

I would recommend using screen if you want to have good control over the scroll buffer on a remote shell.

You can change the scroll buffer size to suit your needs by setting:

defscrollback 4000

in ~/.screenrc , which will specify the number of lines you want to be buffered (4000 in this case).

Then you should run your script in a screen session, e.g. by executing screen ./myscript.sh or first executing screen and then ./myscript.sh inside the session.

It's also possible to enable logging of the console output to a file. You can find more info on the screen's man page .
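
For that logging, one option (hedged; check your screen version's man page) is to start the session with the -L flag, which writes a transcript to a screenlog.0 file in the current directory:

screen -L ./myscript.sh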

,

From your description, it sounds like the "problem" is that you are using screen, tmux, or another window manager dependent on them (byobu). Normally you should be able to scroll back in putty with no issue. Exceptions include if you are in an application like less or nano that creates its own "window" on the terminal.

With screen and tmux you can generally scroll back with SHIFT + PGUP (same as you could from the physical terminal of the remote machine). They also both have a "copy" mode that frees the cursor from the prompt and lets you use arrow keys to move it around (for selecting text to copy with just the keyboard). It also lets you scroll up and down with the PGUP and PGDN keys. Copy mode under byobu using screen or tmux backends is accessed by pressing F7 (careful, F6 disconnects the session). To do so directly under screen you press CTRL + a then ESC or [ . You can use ESC to exit copy mode. Under tmux you press CTRL + b then [ to enter copy mode and ] to exit.

The simplest solution, of course, is not to use either. I've found both to be quite a bit more trouble than they are worth. If you would like to use multiple different terminals on a remote machine simply connect with multiple instances of putty and manage your windows using, er... Windows. Now forgive me but I must flee before I am burned at the stake for my heresy.

EDIT: almost forgot, some keys may not be received correctly by the remote terminal if putty has not been configured correctly. In your putty config check Terminal -> Keyboard . You probably want the function keys and keypad set to be either Linux or Xterm R6 . If you are seeing strange characters on the terminal when attempting the above this is most likely the problem.

[Dec 01, 2018] Lua editors WoWWiki FANDOM powered by Wikia

Dec 01, 2018 | wowwiki.wikia.com

[Nov 13, 2018] Resuming rsync partial (-P/--partial) on a interrupted transfer

Notable quotes:
"... should ..."
May 15, 2013 | stackoverflow.com

Glitches , May 15, 2013 at 18:06

I am trying to back up my file server to a remote file server using rsync. Rsync does not successfully resume when a transfer is interrupted. I used the partial option, but rsync doesn't find the file it already started, because it renames it to a temporary file; when resumed, it creates a new file and starts from the beginning.

Here is my command:

rsync -avztP -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"

When this command is run, a backup file named OldDisk.dmg from my local machine gets created on the remote machine as something like .OldDisk.dmg.SjDndj23 .

Now when the internet connection gets interrupted and I have to resume the transfer, I have to find where rsync left off by finding the temp file like .OldDisk.dmg.SjDndj23 and rename it to OldDisk.dmg so that it sees there already exists a file that it can resume.

How do I fix this so I don't have to manually intervene each time?

Richard Michael , Nov 6, 2013 at 4:26

TL;DR : Use --timeout=X (X in seconds) to change the default rsync server timeout, not --inplace .

The issue is the rsync server processes (of which there are two, see rsync --server ... in ps output on the receiver) continue running, to wait for the rsync client to send data.

If the rsync server processes do not receive data for a sufficient time, they will indeed time out, self-terminate and clean up by moving the temporary file to its "proper" name (e.g., no temporary suffix). You'll then be able to resume.

If you don't want to wait for the long default timeout to cause the rsync server to self-terminate, then when your internet connection returns, log into the server and clean up the rsync server processes manually. However, you must politely terminate rsync -- otherwise, it will not move the partial file into place; but rather, delete it (and thus there is no file to resume). To politely ask rsync to terminate, do not SIGKILL (e.g., -9 ), but SIGTERM (e.g., pkill -TERM -x rsync - only an example, you should take care to match only the rsync processes concerned with your client).

Fortunately there is an easier way: use the --timeout=X (X in seconds) option; it is passed to the rsync server processes as well.

For example, if you specify rsync ... --timeout=15 ... , both the client and server rsync processes will cleanly exit if they do not send/receive data in 15 seconds. On the server, this means moving the temporary file into position, ready for resuming.
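
Applied to the command from the question, that would look roughly like this (only --timeout is added; 30 seconds is an arbitrary choice):

rsync -avztP --timeout=30 -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"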

I'm not sure what the default timeout is, i.e. how long the various rsync processes will try to send/receive data before they die (it might vary with operating system). In my testing, the server rsync processes remain running longer than the local client. On a "dead" network connection, the client terminates with a broken pipe (e.g., no network socket) after about 30 seconds; you could experiment or review the source code. Meaning, you could try to "ride out" the bad internet connection for 15-20 seconds.

If you do not clean up the server rsync processes (or wait for them to die), but instead immediately launch another rsync client process, two additional server processes will launch (for the other end of your new client process). Specifically, the new rsync client will not re-use/reconnect to the existing rsync server processes. Thus, you'll have two temporary files (and four rsync server processes) -- though, only the newer, second temporary file has new data being written (received from your new rsync client process).

Interestingly, if you then clean up all rsync server processes (for example, stop your client, which will stop the new rsync servers, then SIGTERM the older rsync servers), it appears to merge (assemble) all the partial files into the new properly named file. So, imagine a long-running partial copy which dies (and you think you've "lost" all the copied data), and a short-running re-launched rsync (oops!)... you can stop the second client, SIGTERM the first servers, it will merge the data, and you can resume.

Finally, a few short remarks:

JamesTheAwesomeDude , Dec 29, 2013 at 16:50

Just curious: wouldn't SIGINT (aka ^C ) be 'politer' than SIGTERM ? – JamesTheAwesomeDude Dec 29 '13 at 16:50

Richard Michael , Dec 29, 2013 at 22:34

I didn't test how the server-side rsync handles SIGINT, so I'm not sure it will keep the partial file - you could check. Note that this doesn't have much to do with Ctrl-c ; it happens that your terminal sends SIGINT to the foreground process when you press Ctrl-c , but the server-side rsync has no controlling terminal. You must log in to the server and use kill . The client-side rsync will not send a message to the server (for example, after the client receives SIGINT via your terminal Ctrl-c ) - might be interesting though. As for anthropomorphizing, not sure what's "politer". :-) – Richard Michael Dec 29 '13 at 22:34

d-b , Feb 3, 2015 at 8:48

I just tried this timeout argument rsync -av --delete --progress --stats --human-readable --checksum --timeout=60 --partial-dir /tmp/rsync/ rsync://$remote:/ /src/ but then it timed out during the "receiving file list" phase (which in this case takes around 30 minutes). Setting the timeout to half an hour kind of defeats the purpose. Any workaround for this? – d-b Feb 3 '15 at 8:48

Cees Timmerman , Sep 15, 2015 at 17:10

@user23122 --checksum reads all data when preparing the file list, which is great for many small files that change often, but should be done on-demand for large files. – Cees Timmerman Sep 15 '15 at 17:10

[Nov 08, 2018] How to find which process is regularly writing to disk?

Notable quotes:
"... tick...tick...tick...trrrrrr ..."
"... /var/log/syslog ..."
Nov 08, 2018 | unix.stackexchange.com

Cedric Martin , Jul 27, 2012 at 4:31

How can I find which process is constantly writing to disk?

I like my workstation to be close to silent, and I just built a new system (P8B75-M + Core i5 3450s -- the 's' because it has a lower max TDP) with quiet fans etc. and installed Debian Wheezy 64-bit on it.

And something is getting on my nerves: I can hear some kind of pattern, as if the hard disk was writing or seeking something ( tick...tick...tick...trrrrrr , rinse and repeat every second or so).

In the past (many, many years ago) I had a similar issue, and it turned out to be some CUPS log or other; I simply redirected that (unimportant) logging to a (real) RAM disk.

But here I'm not sure.

I tried the following:

ls -lR /var/log > /tmp/a.tmp && sleep 5 && ls -lR /var/log > /tmp/b.tmp && diff /tmp/?.tmp

but nothing is changing there.

Now the strange thing is that I also hear the pattern when the prompt asking me to enter my LVM decryption passphrase is showing.

Could it be something in the kernel/system I just installed or do I have a faulty harddisk?

hdparm -tT /dev/sda reports a correct HD speed (about 130 MB/s non-cached, SATA 6 Gb/s interface) and I've already installed and compiled big sources (Emacs) without issue, so I don't think the system is bad.

(HD is a Seagate Barracuda 500GB)

Mat , Jul 27, 2012 at 6:03

Are you sure it's a hard drive making that noise, and not something else? (Check the fans, including PSU fan. Had very strange clicking noises once when a very thin cable was too close to a fan and would sometimes very slightly touch the blades and bounce for a few "clicks"...) – Mat Jul 27 '12 at 6:03

Cedric Martin , Jul 27, 2012 at 7:02

@Mat: I'll take the hard drive outside of the case (the connectors should be long enough) to be sure and I'll report back ; ) – Cedric Martin Jul 27 '12 at 7:02

camh , Jul 27, 2012 at 9:48

Make sure your disk filesystems are mounted relatime or noatime. Otherwise even file reads can cause writes to inodes to record the access time. – camh Jul 27 '12 at 9:48

mnmnc , Jul 27, 2012 at 8:27

Did you try examining what a program like iotop shows? It will tell you exactly which process is currently writing to the disk.
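
For example (flags per the iotop man page; hedged), the following limits the listing to processes actually doing I/O and takes a few batch samples, while the unfiltered sample output below shows the full listing:

sudo iotop -o -b -n 3    # -o: only processes doing I/O, -b: batch mode, -n 3: three samples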

example output:

Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
    1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init
    2 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kthreadd]
    3 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/0]
    6 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/0]
    7 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [watchdog/0]
    8 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/1]
 1033 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [fl