Softpanorama

May the source be with you, but remember the KISS principle ;-)
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

Perl substr function


Substr is a classic string manipulation function that, as far as I know, was first introduced in PL/1 in the early 1960s. It is the most important function for manipulating strings in Perl. One needs to understand it to be an effective Perl programmer.

A complete understanding of the substr function is essential to becoming an effective Perl programmer

Like PL/1, Perl provides the substr function (substring) to extract parts of a scalar (e.g. a string). In the most common form of substr invocation you specify three arguments (an optional fourth, the replacement string, is discussed later):

  1. String to be used
  2. Starting position of the substring that you want to extract (can be negative)
  3. Length of the substring

For example if you wanted to get the first character of a string:

 $name = "Nick";
 $initial = substr($name,0,1);
                    |    | |_______ length
                    |    |_________ starting position
                    |______________ name of the string

The second argument can be negative. In this case the starting position is counted from the end of the string, not from the start of the string:

$last=substr($name, length($name)-1,1); # the last character

$last= substr($name, -1,1); # same as above

As in PL/1 and REXX, omitting the last argument means that all characters up to the end of the string (the tail) will be taken.

$last=substr($name, -1) ; # same as above
$tail=substr($name,10); # all characters of the string starting from position 10
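
The offset and length rules described so far can be collected in one small sketch. The sample string and the variable names are only illustrative assumptions; the calls themselves are plain substr:

```perl
use strict;
use warnings;

my $name = "Nick Bezroukov";          # 14 characters, offsets 0..13

my $first = substr($name, 0, 1);      # offset 0, length 1
my $last  = substr($name, -1, 1);     # one character back from the end
my $tail  = substr($name, 5);         # from offset 5 to the end of the string
my $end3  = substr($name, -3);        # the last three characters

print "$first|$last|$tail|$end3\n";   # prints "N|v|Bezroukov|kov"
```

Note that the offset is zero-based, so offset 5 is the sixth character of the string.
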
If you want, you can also use the substr function to replace any fragment of a string: as in PL/1 and REXX, substr can be used on the left side of an assignment statement (such functions are called pseudofunctions or lvalue functions):
substr($name,0,1)=uc(substr($name,0,1)); # capitalize the first symbol like in ucfirst.

This pseudofunction (lvalue) capability of substr is very useful. For example, we can chop off the last character of the scalar $name:

substr($name, -1) = '';  # will truncate the string $name by one letter
This is actually more flexible than the chop function, as the number of characters to chop can be a variable:
substr($name, -$k) = ''; # will truncate the string $name by $k letters (for $k >= 1)
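
One caveat with a variable count: -0 is the same as 0, so a count of zero would select (and delete) the whole string. A minimal sketch of a guarded version follows; chop_tail is a hypothetical helper name, not a Perl built-in:

```perl
use strict;
use warnings;

# Hypothetical helper: remove the last $k characters of a string in place.
sub chop_tail {
    my ($ref, $k) = @_;
    return if $k <= 0;                         # -0 is 0, which would select the whole string
    $k = length($$ref) if $k > length($$ref);  # never reach back past the start of the string
    substr($$ref, -$k) = '';
    return;
}

my $line = "Nick Bezroukov";
chop_tail(\$line, 3);        # $line is now "Nick Bezrou"

my $same = "abc";
chop_tail(\$same, 0);        # $same is unchanged

print "$line|$same\n";       # prints "Nick Bezrou|abc"
```
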

Here we used a negative offset to count backwards from the end of the string. You can achieve the same result with a negative value of the length parameter, which tells substr to leave that many characters off the end of the string:

$name=substr($name, 0,-2);  # will truncate the string $name by two letters

Here you can see that substr interprets a negative third argument in a similar way to a negative second argument: as an offset from the end of the string, so the substring stops at position length($name)-2.
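
A short sketch of the negative length in action; the file name is just an assumed sample value:

```perl
use strict;
use warnings;

my $file = "report.txt";

my $base  = substr($file, 0, -4);                  # leave the last four characters off
my $inner = substr($file, 1, -1);                  # drop one character from each end
my $same  = substr($file, 0, length($file) - 4);   # the same as $base, spelled out

print "$base|$inner|$same\n";                      # prints "report|eport.tx|report"
```
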

Since substr($name,0,0) denotes the empty substring at the very beginning of the string, it is clear that you can add a prefix to the string using substr:

$name='bezroukov';
substr($name, 0,0)='Nick '; # will add the first name to the last
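
The same zero-length trick works at any position, not only at the start. A minimal sketch, with sample values chosen only for illustration:

```perl
use strict;
use warnings;

my $s = "bezroukov";

substr($s, 0, 0) = "Nick ";         # prefix: "Nick bezroukov"
substr($s, length($s), 0) = "!";    # append at the end: "Nick bezroukov!"
substr($s, 4, 0) = ",";             # insert in the middle: "Nick, bezroukov!"

print "$s\n";                       # prints "Nick, bezroukov!"
```

For plain appending, the .= operator is the more usual choice; the point here is only that a zero-length substring can sit at any offset from 0 to length($s).
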

Another interesting idiom is the conversion of the first letter to upper case. It is a kind of generalized ucfirst function, as we can convert not only the first letter but any number of letters in any part of the string. For example:

substr($name,0,3) =~ tr/a-z/A-Z/; # convert the first three letters to uppercase
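
As a sketch of the "any part of the string" claim, the fragment below changes case at the start and in the middle of a sample string. Keep in mind that tr/a-z/A-Z/ touches only ASCII letters, while uc follows Perl's broader case rules:

```perl
use strict;
use warnings;

my $name = "bezroukov";

substr($name, 0, 1) = uc(substr($name, 0, 1));   # first letter only: "Bezroukov"
substr($name, 3, 3) =~ tr/a-z/A-Z/;              # three letters from offset 3: "BezROUkov"

print "$name\n";                                 # prints "BezROUkov"
```
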

The important difference between substr in PL/1 and REXX and substr in Perl is that the Perl version can substitute a new string for the deleted fragment. This is a semi-useful generalization, as it borders on overcomplexity, and there is a nice simple idiom using regular expressions for substitution (with search):

s/search_string/replacement_string/;
(see Chapter 5).
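
To make the contrast concrete, here is a small sketch of the two approaches on an assumed date string: substr replaces by position, while s/// finds the place by searching.

```perl
use strict;
use warnings;

my $by_position = "2015-07-30";
substr($by_position, 5, 2) = "08";    # position is known in advance, no search

my $by_search = "2015-07-30";
$by_search =~ s/-07-/-08-/;           # position is found by the regex engine

print "$by_position $by_search\n";    # prints "2015-08-30 2015-08-30"
```
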

Here is the man entry for substr that describes this possibility (the bold is mine -NNB):

substr EXPR,OFFSET,LEN,REPLACEMENT
 
substr EXPR,OFFSET,LEN
 
substr EXPR,OFFSET
Extracts a substring out of EXPR and returns it. First character is at offset 0, or whatever you've set $[  to (but don't do that).

If OFFSET is negative (or more precisely, less than $[), starts that far from the end of the string.

If LEN is omitted, returns everything to the end of the string. If LEN is negative, leaves that many characters off the end of the string.

If you specify a substring that is partly outside the string, the part within the string is returned. If the substring is totally outside the string a warning is produced.

You can use the substr() function as an lvalue, in which case EXPR must itself be an lvalue. If you assign something shorter than LEN, the string will shrink, and if you assign something longer than LEN, the string will grow to accommodate it. To keep the string the same length you may need to pad or chop your value using sprintf().

An alternative to using substr()  as an lvalue is to specify the replacement string as the 4th argument. This allows you to replace parts of the EXPR and return what was there before in one operation, just as you can with splice().
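
The boundary rules and the sprintf advice from the man entry can be tried out with a short sketch. The strings and the record layout are assumptions made up for the example; the warning is captured in a local handler only so that the script stays quiet:

```perl
use strict;
use warnings;

my $s = "abcde";

# Partly outside the string: only the part inside is returned.
my $partial = substr($s, 3, 10);          # "de"

# Totally outside the string: undef is returned and a warning is issued.
my @warnings;
my $outside;
{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    $outside = substr($s, 10, 2);
}

# Keeping the length fixed: pad the new value with sprintf before assigning.
my $record = "id=0042;";
substr($record, 3, 4) = sprintf("%04d", 7);   # "id=0007;"

print "$partial|", (defined $outside ? "defined" : "undef"), "|$record\n";
```
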

As you can see, Perl substr accepts a fourth argument: the "replacement string". This duplicates the ability to use substr on the left side of an assignment statement and as such might well be considered an example of useless or harmful "innovation". For example:

$string='abba';
substr($string,1,2)='vv'; # produces "avva".

is equivalent to

$string='abba';
substr($string,1,2,'vv'); # same thing

One marginally useful case where the fourth argument makes some sense is inserting a substring at a certain position of the string, for example:

$a="world";
$b=substr($a,0,0,"Hello "); # note that the length can be different
print $a; # will print "Hello world"
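
Another case where the four-argument form reads naturally is consuming fixed-width fields from the front of a record, since it returns the old fragment and removes it in one step. A minimal sketch, assuming a made-up layout of a 6-character and a 10-character field followed by a year:

```perl
use strict;
use warnings;

my $record = "NICK  BEZROUKOV 1996";

my $first = substr($record, 0, 6, '');    # returns "NICK  " and removes it from $record
my $last  = substr($record, 0, 10, '');   # returns "BEZROUKOV " and removes it
# $record now holds only what is left: "1996"

print "[$first][$last][$record]\n";       # prints "[NICK  ][BEZROUKOV ][1996]"
```
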

For some reason unknown to me, the substr function does not affect the default variable $_; the result is lost unless you assign it explicitly:

$a='abba';
$_='';
substr($a,0,1);
print "$_\n"; # Will not print the first letter of the string
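
In practice this means that $_ has to be passed to substr explicitly, and the result has to be stored explicitly. A small sketch with assumed sample data:

```perl
use strict;
use warnings;

my @initials;
for ("nick", "larry") {
    # $_ is named explicitly; substr does not pick it up by default
    push @initials, uc(substr($_, 0, 1));
}

print "@initials\n";    # prints "N L"
```
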

In some cases you can use sprintf instead of substr (see sprintf in Perl). It is convenient, for example, for putting variables into predefined places in a dynamically generated command. For example, there are some difficulties in working with UNIX permissions, as they are octal and can be mangled if Perl converts them into decimal, so using sprintf in this case is simpler:

$perm=0755;
$string = sprintf ("/bin/chmod %o $target/*", $perm);
`$string`;
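
The difference between the octal and decimal renderings is easy to see in a sketch that only builds the command string and does not run it. The directory in $target is a hypothetical value chosen for the example:

```perl
use strict;
use warnings;

my $perm   = 0755;               # octal literal, stored as the number 493
my $target = "/tmp/example";     # hypothetical directory, for illustration only

my $octal   = sprintf("%o", $perm);     # "755"
my $decimal = sprintf("%d", $perm);     # "493", not what chmod expects
my $padded  = sprintf("%04o", $perm);   # "0755"

my $command = sprintf("/bin/chmod %o %s/*", $perm, $target);
print "$command\n";              # prints "/bin/chmod 755 /tmp/example/*"
```
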

We will discuss sprintf in more detail in sprintf in Perl.



Old News ;-)

[Jul 30, 2015] Manipulating a Substring with substr (Learning Perl, 3rd Edition)

The substr operator works with only a part of a larger string. It looks like this:
$part = substr($string, $initial_position, $length);

It takes three arguments: a string value, a zero-based initial position (like the return value of index), and a length for the substring. The return value is the substring:

my $mineral = substr("Fred J. Flintstone", 8, 5);  # gets "Flint"
my $rock = substr "Fred J. Flintstone", 13, 1000;  # gets "stone"

As you may have noticed in the previous example, if the requested length (1000 characters, in this case) would go past the end of the string, there's no complaint from Perl, but you simply get a shorter string than you might have. But if you want to be sure to go to the end of the string, however long or short it may be, just omit that third parameter (the length), like this:

my $pebble = substr "Fred J. Flintstone", 13;  # gets "stone"

The initial position of the substring in the larger string can be negative, counting from the end of the string (that is, position -1 is the last character).[334] In this example, position -3 is three characters from the end of the string, which is the location of the letter i:

my $out = substr("some very long string", -3, 2);  # $out gets "in"

[334] This is analogous to what we saw with array indices in Chapter 3, "Lists and Arrays". Just as arrays may be indexed either from 0 (the first element) upwards or from -1 (the last element) downwards, substring locations may be indexed from position 0 (at the first character) upwards or from position -1 (at the last character) downwards.

As you might expect, index and substr work well together. In this example, we can extract a substring that starts at the location of the letter l:

my $long = "some very very long string";
my $right = substr($long, index($long, "l") );
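
One detail worth guarding against when combining the two: index returns -1 when nothing is found, and substr would quietly treat that -1 as "the last character". A minimal sketch with an explicit check (the fallback to an empty string is an assumption of this example):

```perl
use strict;
use warnings;

my $long = "some very very long string";

my $pos   = index($long, "l");                        # 15
my $right = $pos >= 0 ? substr($long, $pos) : '';     # "long string"

my $miss    = index($long, "z");                      # -1: not found
my $nothing = $miss >= 0 ? substr($long, $miss) : ''; # stays empty

print "$pos|$right|$miss|$nothing\n";   # prints "15|long string|-1|"
```
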

Now here's something really cool: The selected portion of the string can be changed if the string is a variable:[335]

my $string = "Hello, world!";
substr($string, 0, 5) = "Goodbye";  # $string is now "Goodbye, world!"

[335] Well, technically, it can be any lvalue. What that term means precisely is beyond the scope of this book, but you can think of it as anything that can be put on the left side of the equals sign (=) in a scalar assignment. That's usually a variable, but it can (as you see here) even be an invocation of the substr operator.

As you see, the assigned (sub)string doesn't have to be the same length as the substring it's replacing. The string's length is adjusted to fit. Or if that wasn't cool enough to impress you, you could use the binding operator (=~) to restrict an operation to work with just part of a string. This example replaces fred with barney wherever possible within just the last twenty characters of a string:

substr($string, -20) =~ s/fred/barney/g;
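
A self-contained sketch of the same idea, with an assumed sample string and a window of the last ten characters, shows that matches outside the window are left alone:

```perl
use strict;
use warnings;

my $string = "fred met fred and then fred left";

# Only the last ten characters (" fred left") are visible to the substitution.
substr($string, -10) =~ s/fred/barney/g;

print "$string\n";   # prints "fred met fred and then barney left"
```
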

To be completely honest, we've never actually needed that functionality in any of our own code, and chances are that you'll never need it either. But it's nice to know that Perl can do more than you'll ever need, isn't it?

Much of the work that substr and index do could be done with regular expressions. Use those where they're appropriate. But substr and index can often be faster, since they don't have the overhead of the regular expression engine: they're never case-insensitive, they have no metacharacters to worry about, and they don't set any of the memory variables.
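
The "no metacharacters to worry about" point can be illustrated with a literal that contains parentheses and a dot. This is only a sketch with made-up sample text: index takes the text as-is, while the regular expression needs \Q...\E to do the same.

```perl
use strict;
use warnings;

my $text   = 'total (approx.) = 9.99';
my $needle = '(approx.)';

# index/substr: the needle is taken literally
my $pos   = index($text, $needle);                                    # 6
my $found = $pos >= 0 ? substr($text, $pos, length($needle)) : undef;

# regex: the needle must be quoted, or ( ) and . act as metacharacters
my ($via_regex) = $text =~ /(\Q$needle\E)/;

print "$pos|$found|$via_regex\n";   # prints "6|(approx.)|(approx.)"
```
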

Besides assigning to the substr function (which looks a little weird at first glance, perhaps), you can also use substr in a slightly more traditional manner with the four-argument version, in which the fourth argument is the replacement substring:

By traditional we mean in the "function invocation" sense, but not the "Perl" sense, since this feature was introduced to Perl relatively recently.

my $previous_value = substr($string, 0, 5, "Goodbye");

The previous value comes back as the return value, although as always, you can use this function in a void context to simply discard it.




Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...

You can use PayPal to buy a cup of coffee for authors of this site

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. If you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.

Last modified: July, 30, 2015