NAME

Learning RPerl

COPYRIGHT

Learning RPerl is Copyright © 2013, 2014, 2015, 2016, 2017, 2018 William N. Braswell, Jr. All Rights Reserved.

Learning RPerl is part of the RPerl Family of software and documentation.

BOOK TITLE

Learning RPerl

~ or ~

Let's Write Fast Perl!

~ affectionately known as ~

The Roadrunner Book

.
...
.....
.......
being
.......
.....
...
.

The Official Introductory-Level Reference, User Manual, and Educational Documentation

~ for ~

Restricted Perl, The Optimizing Perl 5 Compiler


DEDICATION

For Anna.


EDITION

0th Edition, Pre-Release Copy

Automatically Generated From RPerl/Learning.pm v0.195
Using pod2rperlhtml.pl v0.027
On Friday, April 19th, 2019 at 9:55pm CDT


TABLE OF CONTENTS: CHAPTERS AT-A-GLANCE


FOREWORD

[ INSERT FOREWORD CONTENT HERE ]


PREFACE

Section 0.1: Who I Am

My name is William N. Braswell, Jr.; I am also known by a number of other names including Will the Chill, The Voice In The Wilderness, Skipper Brassie ("braz-ee"), and just Will.

I have a degree in computer science and mathematics from Texas Tech University, I have worked as a Perl software developer for over 15 years, and I am the founder of the Auto-Parallel Technologies consulting company.

LinkedIn Profile

GitHub Profile

I am 1 of 3 co-founders of the Perl 11 movement.

Perl11.org

I am also the President of the Austin Perl Mongers.

Austin.pm

Most importantly, I am the creator of the RPerl optimizing compiler for Perl 5, about which you are currently reading!

RPerl.org

Section 0.2: Why I Wrote This Book

Using RPerl is different enough from normal Perl 5 to necessitate the creation of in-depth user documentation.

Manual pages, cookbooks, and example source code alone are not enough; only a full textbook can provide the level of detail necessary to truly learn RPerl.

This is that textbook.

Section 0.3: History Of This Book

RPerl v1.0 was released on US Independence Day, July 4th, 2015; 6 days later, work began on the source code solution to exercise 1 of chapter 1 of this book:

First Learning RPerl Commit on GitHub

Significant GitHub commit dates may be viewed by cloning the RPerl git repository and executing the following git command:

    $ git log --reverse --all --date=short --pretty='%cd: %s' | grep 'Learning RPerl'
    2015-07-10: Learning RPerl, Chapter 1, Exercise 1, Hello World
    ...

Section 0.4: TPF Grants

This book was made possible in part by 2 generous grants from The Perl Foundation, as part of the September 2015 and January / February 2016 rounds of funding.

Special thanks to TPF Grants Committee Secretary, Makoto Nozaki; TPF Grant Manager, Mark Jensen; TPF Grants Committee's various supporting members; and everyone who gave positive feedback on the grant proposals.

A history of TPF grant #1 may be found at the following links:

A history of TPF grant #2 may be found at the following links:

Section 0.5: Acknowledgements & Thanks

Countless people have contributed to the development of RPerl; from source code to bug testing to financial donations to emotional support, it truly takes a village to build a compiler!

Below are the contents of the official RPerl thank-you file, listing the handles (online nicknames) and names of the most important RPerl contributors. If you don't already know who these people are, you will be pleasantly surprised by researching each of them.

Latest THANKS File

    Many Thanks To irc.perl.org:

    #perl5 Founder timtoady

    #perl6 Founder timtoady (again)

    #perl11 Founders ingy & rurban & willthechill (yours truly)

    #perl11 Members bulk88 & mst

    #inline Founders ingy (again) & nwatkiss

    #inline Members davido & mohawk & sisyphus


    Additional Thanks To:

    Eyapp Creator Casiano Rodriguez-Leon, PhD

    Austin Perl Mongers

    All RPerl Contributors & Users & Supporters

Section 0.6: Defense / Apology

I'm sure I will make errors while writing this book.

I may even upset some people, particularly those who have an emotional or financial investment in slow Perl software.

Despite my best efforts, I remain a fallible human being; thus, bad spelling and grammer and run-on sentences and parts that are hard to understand and parts that are not funny and formattiNg errors and bad spelling adn repetitions and other annoyances will doubtless plague this tome but we must not allow such trivialities as, improper punctuation to affect our willingness and ability to learn how to write super-fast RPerl software.

If you find a mistake in this book (other than in the immediately preceding paragraph), please utilize the following link to create a new GitHub issue (bug report) using a title starting with the words "Learning RPerl":

New GitHub Issue

I will try my best to create an engaging and educational experience for you, the reader; however, in anticipation of the inevitable disappointment you may experience, I can only humbly offer...

I'M SORRY!

Section 0.7: POD

"The Pod format is not necessarily sufficient for writing a book."

http://perldoc.perl.org/perlpod.html

~ Saint Larry Wall & Sean M. Burke


"Challenge accepted."

https://github.com/wbraswell/rperl/blob/master/script/development/pod2rperlhtml.pl

~ Will Braswell


CHAPTER 1: INTRODUCTION

Section 1.1: Welcome To The Roadrunner Book!

You are about to learn the basic concepts of writing software using the RPerl optimizing compiler for the Perl computer programming language. With the skills gained by reading this book, you will be empowered to create new super-fast RPerl programs which can be intermixed with the enormous amount of existing Perl software available on the Internet.

This book is named and stylized for the animal mascot for RPerl, Roadie the Roadrunner. RPerl, like Roadie, "runs really fast".

Throughout this text, the following 14 typography conventions are utilized:

Section 1.2: Learning Perl

This book is purposefully patterned after the popular educational text Learning Perl, affectionately known as the Llama Book. Both the Roadrunner Book and the Llama Book are meant as introductory texts on Perl topics. The Llama Book is focused on normal Perl, and the Roadrunner Book is focused on optimized Perl.

This book copies the same chapter topics as Learning Perl, but all content is re-written for RPerl. Learning RPerl also copies the same exercise concepts as Learning Perl, but all solutions are re-written in RPerl. Both books are canonical and may be used together in the classroom; the source code solutions are meant to be compared side-by-side as textbook examples of normal Perl versus optimized Perl.

Please support the Perl community by purchasing a copy of Learning Perl, 7th Edition from our friends at O'Reilly:

http://shop.oreilly.com/product/0636920049517.do

Section 1.3: Is This Book A Good Choice For You?

If you answered "yes" to any of these questions, then the Roadrunner Book is definitely for you!

If you answered "no" to all of these questions, then this book may still be for you; give it a try!

If you hate Perl, or only love slow software, or wish all computers would explode, then we suggest some soul-searching and a few Saint Larry videos. You'll thank us in the morning.

Section 1.4: Why Aren't There More Footnotes?

This is a purposefully simple book, in the same way RPerl is a purposefully simple subset of the full Perl 5 programming language.

Section 1.5: What About Educational Programming Exercises?

There are one or more programming exercises at the end of every chapter, and full answers to each problem are given near the end of the book in Appendix A.

For maximum educational effect, we suggest you attempt to write each piece of code on your own before looking at our solutions.

If you are using this as an official textbook for certification or academic credit, such as at LAMPuniversity.org or a traditional school, you are obviously expected to write all your own code without referring to our or anyone else's solutions whatsoever. We suggest you enclose Appendix A with a paper clip or discard it altogether to avoid the potential for accidental academic dishonesty.

Section 1.6: How Long Should Each Exercise Take To Complete?

The numbers at the beginning of each exercise indicate the approximate number of minutes required for an average person to reach a full working solution. If it takes you less time, good for you! If it takes you more time, don't worry, it's no big deal; learning technical skills requires time and dedication. All experts were once novices.

Section 1.7: What If I Want To Teach RPerl?

Thank you for helping spread the love of Perl and the speed of RPerl!

As previously mentioned, this book may either be used solo or combined with the Llama Book. For students who are not already familiar with Perl, you may wish to use this text alone in order to simplify and ease the learning experience. For students who are already familiar with Perl or other dynamic programming languages like the snake or the red gemstone, you may wish to use both textbooks for a more in-depth compare-and-contrast approach.

Section 1.8: What Does The Name RPerl Actually Mean?

RPerl stands for "Restricted Perl", in that we restrict our use of Perl to those parts which can be made to run fast. RPerl also stands for "Revolutionary Perl", in that we hope RPerl's speed will revolutionize the software development industry, or at least the Perl community. RPerl might even stand for "Roadrunner Perl", in that it runs really fast.

Section 1.9: Why Did Will Invent RPerl?

Will loves Perl and the Perl community.

Will is a scientist and needs his code to run really fast.

Will doesn't like the hassle of writing code in C or C++ or XS or Inline::C or Inline::CPP.

Will waited a decade or two before realizing he had to do it himself.

Section 1.10: Why Didn't Will Just Use Normal Perl?

Dynamic languages like Perl are fast at running some kinds of computational actions, such as regular expressions (text data pattern matching) and reading from a database.

Unfortunately, dynamic languages are slow at running general-purpose computations, such as arithmetic and moving data around in memory. Sometimes very slow.

Dynamic languages like Perl are also flexible, powerful, and relatively easy to learn. Sometimes too flexible.

RPerl's goal is to keep all of Perl's power and ease-of-use, while removing the redundant parts of Perl's flexibility in order to gain a major runtime speed boost.

The most complex and flexible parts of Perl are called "high magic", so RPerl is focused on supporting the "low magic" parts of Perl which can be made to run fast.

Section 1.11: Is RPerl Simple Or Complicated?

RPerl is specifically designed to remove the confusing and overly-complicated parts of Perl.

RPerl also introduces a number of additional rules and templates which are not present in normal Perl, notably including the use of real data types.

The net effect of removing Perl complexity and adding RPerl rules falls in favor of RPerl, due primarily to the exceedingly complex nature of Perl.

In other words, RPerl is easier to learn and use than dynamic languages like normal Perl, and most any other language in general.

Section 1.12: How Is RPerl Being Promoted?

The RPerl team has been regularly promoting RPerl in a number of physical and digital venues, including but not limited to:

Section 1.13: What Is The RPerl Community Up To?

As of US Independence Day 2016, RPerl v2.0 (codename Pioneer) has been publicly released and is in use by a number of early adopters around the world.

RPerl development is proceeding with financial support from both Kickstarter crowdfunding and official grant monies from The Perl Foundation.

The RPerl community is beginning to grow, and there are a number of exciting RPerl projects currently in the works.

If you would like to create software libraries and applications (AKA "programs" or "apps") to be utilized by end-users, then please join the RPerl application developers group, also known as the "RPerl App Devs":

RPerl App Devs Group On Facebook

RPerl App Devs Intake Board On Trello

Section 1.14: What Is RPerl Meant To Do?

RPerl is a general-purpose programming language, which means you can use RPerl to efficiently and effectively implement virtually any kind of software you can imagine.

RPerl is especially well-suited for building software which benefits from speed, such as scientific simulations and graphical video games.

RPerl is also good for building software which utilizes Perl's strong-suit of string manipulation; RPerl currently supports basic string operators, with full regular expression support to be added in an upcoming version.

Section 1.15: What Is RPerl Not Meant To Do?

RPerl has purposefully disabled the most complex features of Perl, such as runtime code evaluation, secret operators, and punctuation variables. If you have purposefully designed your Perl software to depend on these high-magic features, or you are unconditionally committed to continue using high-magic language features, then maybe RPerl isn't for you.

Section 1.16: How Can I Download & Install RPerl?

Installing RPerl ranges from easy to difficult, depending on how well your operating system supports Perl and C++.

On modern operating systems with good Perl support, such as Debian or Ubuntu GNU/Linux, you should be able to install RPerl system-wide by running the following command at your terminal command prompt:

    $ sudo cpan RPerl

You may also choose to use the `cpanm` command for simplicity, or the local::lib tool for single-user (not system-wide) installation, both of which are included in the INSTALL notes document linked below.

If RPerl is properly installed, you should see a short text message displayed when you type the following command:

    $ rperl -v

On operating systems with less Perl support, you may have to perform a number of steps to successfully install RPerl, with dry technical detail available in the INSTALL notes document:

https://github.com/wbraswell/rperl/blob/master/INSTALL

Unless you are an experienced programmer or system administrator, it is strongly recommended you use the Xubuntu operating system. You can download the Xubuntu ISO file at the link below, then use it to create a bootable DVD disc or USB flash drive, install Xubuntu onto any computer, and issue the `sudo cpan RPerl` command as described above.

http://xubuntu.org/getxubuntu

If you are interested in viewing the source code of RPerl itself, you may find the latest major release of RPerl (stable) on CPAN:

https://metacpan.org/author/WBRASWELL

You may find the latest development release of RPerl (possibly unstable) on GitHub:

https://github.com/wbraswell/rperl

Section 1.17: Where Is Perl Software Stored Online?

CPAN is the "Comprehensive Perl Archive Network", the world's most successful and mature centralized software network.

CPAN servers are where most public Perl software is stored, including RPerl.

https://en.wikipedia.org/wiki/CPAN

http://www.cpan.org

Several other programming language communities have copied the success and implementation of CPAN, including JSAN for JavaScript, CRAN for R, and CCAN for C.

Section 1.18: How Can I Obtain Technical Support For RPerl?

Official RPerl technical support is provided through Auto-Parallel Technologies, Inc.

To request more information, please send an e-mail to the following address:

william DOT braswell AT autoparallel DOT com

Section 1.19: Are There Any Free Technical Support Options?

Free technical support for non-commercial users is provided by the RPerl community through Internet Relay Chat.

Server: irc.perl.org

Channel: #perl11

Easy Web Chat: http://irc.lc/magnet/perl11/rperl_newbie@@@

Section 1.20: Are There Any Bugs In RPerl?

All software has small (or large) problems called "bugs", and depending on who is marketing the software, they may even tell you some of the bugs are actually "features"!

RPerl is a work in progress, and may contain a number of bugs, both known and unknown. If you find a bug in RPerl, we would love to hear about it!

The primary bug-tracking platform for RPerl is GitHub Issues, where you may file a new bug report ("new issue") if it is not already listed:

https://github.com/wbraswell/rperl/issues

Please be sure to include all of the following information in your bug report:

Although GitHub Issues is strongly preferred, the RPerl development team also supports the legacy CPAN ticket system:

https://rt.cpan.org/Public/Dist/Display.html?Name=RPerl

Section 1.21: How Can I Write A Program Using RPerl?

Computer programs are written in a human-readable format called "source code". Source code is stored as plain text data inside normal computer files. These are the same kind of text files which you may be already familiar with as ending in the .txt file suffix. The only difference is we simply choose a different file name suffix to identify these specific text files as containing source code. Thus, programs written using the RPerl language are plain text files, which means you can use any text editor to create and modify your RPerl source code. Examples of common text editors include Notepad, Pico, and Vi.

http://www.vim.org

To avoid possible file format problems, do not edit your RPerl programs using a word processor such as Wordpad, Word, OpenOffice, or LibreOffice.

Experienced RPerl developers may choose to utilize an "integrated development environment" (IDE), which is a special text editor made for writing software. Examples of common Perl IDE applications include Eclipse EPIC, Padre, and Komodo (non-free).

http://www.epic-ide.org

http://padre.perlide.org

http://komodoide.com/perl

Section 1.22: A Sample RPerl Program

    #!/usr/bin/env perl
    
    # Foo Bar Arithmetic Example
    
    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;
    
    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils
    
    # [[[ OPERATIONS ]]]
    my integer $foo = 21 + 12;
    my integer $bar = 23 * 42 * 2;
    my number  $baz = to_number($bar) / $foo;
    print 'have $foo = ', to_string($foo), "\n";
    print 'have $bar = ', to_string($bar), "\n";
    print 'have $baz = ', to_string($baz), "\n";

Section 1.23: What Are The Parts Of That Sample RPerl Program?

This program is separated by blank lines into 4 sections: shebang, header, critics, and operations.

Other than the shebang and critics, all lines beginning with # are comments and can be safely ignored or discarded without affecting the program.

The "shebang" section is required and always contains exactly 1 line. The name is short for "hash bang", referring to the two leading characters #! of this line. The "octothorpe" character # (tic-tac-toe symbol) is called a "pound-sign" when used on a telephone, and is called a "hash" (or more recently and less accurately "hash-tag") when used on a computer. The exclamation-point character ! is called a "bang" when used on a computer. When appearing together as the first two characters in a plain text file, the hash and bang characters tell the operating system to run the immediately-following command (in this case the Perl interpreter invoked by /usr/bin/env perl) and pass the text file to that command as input. (Traditionally, the Perl interpreter has been located at /usr/bin/perl; however, we must practice prudence by asking the operating system environment where our Perl interpreter is actually located, in order to support those cases where Perl may be installed in a non-standard directory, such as with Perlbrew or manually-built Perl interpreters. Please see "Section 1.24: How Do I Run The RPerl Compiler?" for more information.) In other words, if the first line of a plain text file is #!/usr/bin/env perl (or #!/usr/bin/perl or something similar), then that file is a Perl program.
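As a rough illustration of the idea (not of how any operating system actually implements it), the following sketch reads the first line of some source code and reports the command named after the #! characters. The subroutine name shebang_command is hypothetical, and the code is ordinary Perl 5 rather than RPerl, so it carries no data types.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return the command named on the shebang line,
# or undef if the text does not start with the #! characters.
sub shebang_command {
    my ($source_code) = @_;

    # Treat the string as an in-memory file and read only its first line.
    open my $fh, '<', \$source_code or die "cannot read source code: $!";
    my $first_line = <$fh>;
    close $fh;

    return undef if not defined $first_line;
    return $first_line =~ /\A#!\s*(.+?)\s*\z/ ? $1 : undef;
}

my $program = "#!/usr/bin/env perl\nprint 'hello', \"\\n\";\n";
print 'have command = ', shebang_command($program), "\n";
```

For the sample text above, the sketch should report /usr/bin/env perl as the command.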

The "header" section is required and always contains 4 lines for an RPerl "program" file ending in .pl, or 5 lines for an RPerl "module" ending in .pm (covered later in Chapter 11). The word use is recognized by Perl as a special "keyword" (also a Perl "function") which has 2 primary purposes: to load additional RPerl modules, and to change RPerl settings called "pragma" settings. (Please see "Section 4.6: use strict; use warnings; Pragmas & Magic" for more information about "pragma" system configuration modes.) The use RPerl; line is dual-purpose: it both loads the RPerl.pm module and enables the special RPerl low-magic pragma. (Please see "Section 1.25.4: The Low-Magic Perl Commandments" for more information about low magic versus high magic.) The use strict; and use warnings; lines enable basic Perl pragmas which require decent programming practices by the human programmers. The our $VERSION = 0.001_000; line sets the version number of this RPerl program.

The "critics" section aids in the proper detection of errors in your RPerl source code. This section of source code is only included as necessary, and may contain 1 or more lines beginning with ## no critic, which disable the errors caused by the over-restrictive nature of some Perl::Critic policies. There are currently 6 critics commands enabled for normal RPerl users, the first 2 of which are given in this example. The "USER DEFAULT 1" critics command allows the use of numeric values such as 21 and 12, as well as the common print command. The "USER DEFAULT 2" critics command allows the printing of 'have $foo = ', where a single-quoted ' string literal value contains the dollar-sign $ sigil (covered later in Chapter 2).

The "operations" section is required and contains 1 or more lines of general-purpose RPerl source code. This is the main body of your program. The 6 lines of source code in our example are used to perform some simple arithmetic and display the results. The my integer $foo = 21 + 12; line declares a new "variable" named $foo. A variable is simply a place to store your individual pieces of data, identified by a name of your choice. (Please see "CHAPTER 2: SCALAR VALUES & VARIABLES (NUMBERS & TEXT)" for more information about variables.) Every variable in RPerl must be defined as containing exactly one type of data, such as whole numbers or decimal numbers or characters or strings, etc. (In normal Perl, there are no data types.)

The variable $foo is defined as containing the data type "integer", which is another word for whole numbers, such as 0 or 1 or -5_126. $foo is initialized to contain the arithmetic result of numeric literal values 21 plus 12.

The my integer $bar = 23 * 42 * 2; line does much the same thing, creating a new numeric variable named $bar and initialized with 23 times 42 times 2.
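The two integer calculations can be checked by hand or with a few lines of ordinary Perl 5. The sketch below is not RPerl: plain Perl has no integer type keyword, so the intended types appear only in comments, and the variable $neg is added purely to show digit grouping with underscores.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Plain Perl 5 sketch; the RPerl data types are noted in comments only.
my $foo = 21 + 12;        # integer: 33
my $bar = 23 * 42 * 2;    # integer: 23 * 42 = 966, then 966 * 2 = 1932
my $neg = -5_126;         # underscores only group digits; same as -5126

print 'have $foo = ', $foo, "\n";
print 'have $bar = ', $bar, "\n";
print 'have $neg = ', $neg, "\n";
```

Because this sketch does not call to_string(), the values are displayed without underscores: 33, 1932, and -5126.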

The my number $baz = to_number($bar) / $foo; line creates a new variable named $baz. This variable contains floating-point numeric data, which is simply another term for decimal numbers such as 0.1 or 1.0 or -5_126.34. The variable $baz is initialized to contain the quotient of the $bar and $foo variables. The to_number() RPerl type conversion "subroutine" converts a non-floating-point integer value to a floating-point number value. (A subroutine is a user-defined operation, in this case pre-defined by the RPerl development team for your convenience; please see "CHAPTER 4: ORGANIZING BY SUBROUTINES" for more information.)
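To see why a floating-point variable is the right place to store this quotient, compare it with the whole-number part and the remainder of the same division. The sketch below is ordinary Perl 5, where the / operator already yields a floating-point result, so no type conversion subroutine appears; the variable names $whole and $remainder are ours.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Plain Perl 5 sketch of the same division, without RPerl data types.
my $foo = 33;
my $bar = 1_932;

my $baz       = $bar / $foo;         # floating-point quotient
my $whole     = int( $bar / $foo );  # whole-number part only: 58
my $remainder = $bar % $foo;         # 1932 - (58 * 33) = 18

print 'have $baz       = ', $baz,       "\n";
print 'have $whole     = ', $whole,     "\n";
print 'have $remainder = ', $remainder, "\n";
```

The quotient is 58 with 18 left over, and 18 / 33 is the repeating decimal 0.5454..., which is why $baz cannot be stored as an integer without losing information.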

The print 'have $foo = ', to_string($foo), "\n"; and following 2 lines will display on screen (not send to paper printer) the labeled values of $foo, $bar, and $baz respectively. The , comma is used to separate multiple "arguments" passed to the print operator. An "argument" is simply a piece of data provided as input to an operation. For example, the number 23 and the variable $foo are arguments to the addition operation in the source code 23 + $foo. In the context of source code operations, the terms "operand" and "parameter" mean the same thing as "argument". (Please see "Section 1.24: How Do I Run The RPerl Compiler?" for the similar-yet-different definition of "argument" in the context of command-line programs.)

The to_string() RPerl type conversion subroutine converts the numeric values to underscore-formatted string values, suitable for use via the print operator. If the to_string() subroutine is not used, then the displayed numeric values will still be human-readable, but will not contain the proper underscores to be accepted back into RPerl as valid numeric data. The "n" in the "\n" double-quoted strings stands for "newline", which places the next piece of printed data down on the following line.
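The underscore formatting can be imitated in ordinary Perl 5 to make the idea concrete. The subroutine underscore_format below is a hypothetical stand-in written for this illustration; it is not RPerl's to_string(), it assumes grouping in threes on both sides of the decimal point, and it only handles plain decimal text (no exponents).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper, NOT RPerl's own to_string(): insert an underscore
# between every group of 3 digits on both sides of the decimal point.
sub underscore_format {
    my ($value) = @_;

    # Split the plain decimal text into sign, whole part, and fraction part.
    my ( $sign, $whole, $fraction ) = "$value" =~ /\A(-?)(\d+)(?:\.(\d+))?\z/
        or die "unsupported numeric format: $value\n";

    # Whole part: group from the right, so 1932 becomes 1_932.
    1 while $whole =~ s/\A(\d+)(\d{3})/${1}_$2/;

    # Fraction part: group from the left, so 5454 becomes 545_4.
    $fraction =~ s/(\d{3})(?=\d)/${1}_/g if defined $fraction;

    return $sign . $whole . ( defined $fraction ? ".$fraction" : q{} );
}

print 'have $foo = ', underscore_format(33),         "\n";
print 'have $bar = ', underscore_format(1_932),      "\n";
print 'have $baz = ', underscore_format( 1_932 / 33 ), "\n";
```

On a typical Perl build, the second line should display 1_932, and the grouped digits of the third line depend on how many decimal places Perl itself chooses to display.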

Section 1.24: How Do I Run The RPerl Compiler?

Normal Perl source code is executed using a software mechanism known as "interpretation", which is to say that Perl is an "interpreted" language and the /usr/bin/perl command is called the "Perl interpreter". The primary alternative to interpretation is "compilation". An interpreted computer program is converted from human-readable source code into computer-readable binary code one line at a time, and each line of code is executed before the following line is converted. On the other hand, a compiled computer program is converted from source code into binary code all at once, then the resulting executable binary file can be run as many times as you like without further conversion. So, RPerl is a "compiled" subset of the Perl language and the /usr/bin/rperl command is called the "RPerl compiler".

Like the Perl interpreter, the RPerl compiler accepts 2 different source code file types as input: Perl programs which end in .pl and Perl modules which end in .pm. Perl program files actually run and execute actions, optionally receiving some functionality from 1 or more Perl module files if specified. Perl modules do not run or execute actions themselves; they only provide functionality which must in turn be called from a Perl program, or from another Perl module which eventually gets called by a Perl program.
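The division of labor between modules and programs can be sketched in ordinary Perl 5. To keep the example in a single runnable file, the module part is written as an inline package; in practice it would live in its own .pm file. The names MyArithmetic and double_it are hypothetical, and RPerl modules have additional requirements covered in Chapter 11.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Module part: would normally be saved in its own file, MyArithmetic.pm.
# It only defines a subroutine; nothing here produces output by itself.
{
    package MyArithmetic;    # hypothetical module name

    sub double_it {
        my ($input) = @_;
        return $input * 2;
    }
}

# Program part: would normally be saved in a .pl file and begin with
# "use MyArithmetic;" to load the module file.
print 'have doubled = ', MyArithmetic::double_it(21), "\n";
```

Only the final print line performs a visible action; the package merely supplies the double_it subroutine which that line calls.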

When you run the RPerl compiler program via the /usr/bin/rperl command, you may optionally provide input arguments which specify the many possible configuration settings to be used by RPerl while it is running. Similarly, you may provide input arguments to most other command-line programs, such as Perl via the /usr/bin/perl command, etc. In the context of command-line programs, the terms "command-line option" and "argument" and "command-line argument" mean the same thing as "option". (Please see "Section 1.23: What Are The Parts Of That Sample RPerl Program?" for the similar-yet-different definition of "argument" in the context of source code operations.)

A list of all valid RPerl compiler options may be seen by issuing the following command:

    $ rperl -?

You may find the same information by viewing the following links:

rperl

https://metacpan.org/pod/distribution/RPerl/script/rperl

To compile-then-execute the preceding RPerl example program, you may copy and paste the entire program (from shebang to final print) into a temporary file such as /tmp/foobar.pl, then execute the following command:

    $ rperl /tmp/foobar.pl

The output of this example program should be:

    have $foo = 33
    have $bar = 1_932
    have $baz = 58.545_454_545_454_5

If the compilation is successful, a new compiled executable file will be created at /tmp/foobar. You may then directly execute the compiled program as many times as you like, without needing to recompile it using the `rperl` command, and you should receive the exact same output as the non-compiled code:

    $ /tmp/foobar
    have $foo = 33
    have $bar = 1_932
    have $baz = 58.545_454_545_454_5

Please see "CHAPTER 11: CLASSES, PACKAGES, MODULES, LIBRARIES" for more information about compiling Perl modules.

Section 1.25: A Quick Overview of RPerl

Section 1.25.1: Creator Of RPerl, Will Braswell

Will Braswell does more than just create Perl compiler software; he is also very active in several other areas of life, including but not limited to:

These areas of interest are reflected in the tone and intention of RPerl.

Section 1.25.2: History Of RPerl

The RPerl project officially began as a New Year's Resolution on January 1st, 2013. Following the grand tradition of Perl creator "Saint" Larry Wall, RPerl version releases are often timed to coincide with major holidays.

After 1 year of work, RPerl v1.0beta1 was released on New Year's Day 2014, eventually followed by RPerl v1.0beta2 on Christmas 2014.

The much-anticipated RPerl v1.0 full release was made on US Independence Day 2015, and RPerl v1.2 came on Halloween 2015.

RPerl v1.3 was released on Thanksgiving 2015, followed by RPerl v1.4 on Christmas 2015, and so forth.

RPerl v1.0 was funded through a Kickstarter campaign, then RPerl v1.2 and v1.3 were funded through a second Kickstarter campaign. Work on the first 6 chapters of this book was funded, in part, by grants from The Perl Foundation.

RPerl v2.0 was released on US Independence Day 2016, exactly 1 year after v1.0 was released, in order to establish a regular annual release cycle.

Section 1.25.3: Performance Of RPerl

The question of "How fast is RPerl?" does not have one simple answer; instead there are several factors and configuration modes to be taken into consideration. A relatively detailed description of the performance and modes may be found at the following link:

http://rperl.org/performance_benchmarks.html

The most condensed answer is that "RPerl is really fast." Utilizing RPerl's fastest execution modes, we see performance very close to the highly-optimized C++ programming language, which means RPerl is now among the short list of "world's fastest languages" along with C, C++, and Fortran.

Section 1.25.4: The Low-Magic Perl Commandments

The high-magic features of Perl are primarily responsible for the slow speed at which Perl executes general-purpose computations. The "R" in RPerl stands for "Restricted", in that we restrict ourselves to only use the low-magic features of Perl which can run really fast.

The definitive list of do's and do-not's for high-magic vs low-magic Perl programming is called The Low-Magic Perl Commandments (LMPC). There are 64 total commandments split into 5 groups of Ideals, Magic, Data, Operations, and Object-Orientation. The "Thou Shalt" commandments appear in the left column, and the "Thou Shalt Nots" appear on the right.

http://rperl.org/the_low_magic_perl_commandments.html

The LMPC draw inspiration from, and (wherever possible) work together with Damian Conway's Perl Best Practices and Jeffrey Thalhammer's Perl::Critic software.

http://shop.oreilly.com/product/9780596001735.do

http://search.cpan.org/~thaljef/Perl-Critic/lib/Perl/Critic/PolicySummary.pod

Section 1.25.5: Perlism & The Book Of RPerl

Perlism is the computer religion dedicated to the use, promotion, and development of the Perl family of programming languages. (Not to be confused with a spiritual religion such as Christianity, a computer religion such as Perlism is an independent and complementary belief structure.)

A Perlite is an adherent to the Perlism religion. Perlism has a revered founder, Saint Larry (himself a devout Christian); a prophet, The Voice In The Wilderness (Will); a monastery and shrine, Perl Monks; commandments, The LMPC; proverbs from Saint Larry including TIMTOWTDI, LMFB, and HTAAOF; and canonical scriptures, including Saint Larry's Apocalypses and The Voice's The Book Of RPerl.

The Book is a description of events surrounding the creation of RPerl and the future of the Internet. It is intended to both educate and entertain.

http://rperl.org/the_book_of_rperl.html

Section 1.25.6: Fun With Proverbs & Catch Phrases & Acronyms

St. Larry has given us short and powerful proverbs, some of which are meant to have a purposefully tongue-in-cheek or sarcastic interpretation.

Will has provided a corollary to each of St. Larry's official proverbs.

St. Larry's Original Proverb                      Will's Corollary Proverb

3 Great Virtues Of A Programmer:                  3 Greater Virtues Of A Programmer:
LIH (Laziness, Impatience, Hubris)                DPH (Diligence, Patience, Humility)

TIMTOWTDI (There Is More Than One Way To Do It)   TDNNTBMTOWTDI (There Does Not Need To Be More Than One Way To Do It)
                                                  TIOFWTDI (There Is One Fastest Way To Do It)

LMFB (Let Many Flowers Bloom)                     PTBF (Pick The Best Flowers)

HTAAOF (Have The Appropriate Amount Of Fun)       DTAAOW (Do The Appropriate Amount Of Work)

In addition to St. Larry's official proverbs, there are a number of other commonly-used catch phrases and ideas in the Perl community.

Original Catch Phrase             Will's Corollary Catch Phrase

Perl 5 Is The Camel               Perl 5 Is The Raptor

Perl 6 Is The Butterfly           RPerl Is The Roadrunner

Perl Is The Onion                 RPerl Is The Scallion

Perl Is The Swiss Army Chainsaw   RPerl Is The Sword

Perl Is Line-Noise                RPerl Is Best Practices
Perl Is A Write-Only Language

Section 1.26: What's New In RPerl v2.0?

The single most significant new feature included in RPerl v2.0 is automatic parallelization. This long-awaited software feature was promised from the very beginning of RPerl's initial development, with RPerl v2.0 being the originally-designated target for release of auto-parallel capabilities. We stuck to the plan and delivered on time: the 4th of July, 2016.

Automatic parallelization is now enabled on 4 parallel CPU cores by default, because quad-core CPUs are common at this time. You may utilize the --num_cores=8 command-line argument to double the default number of parallel cores, for example. (Please see "B.18: Modes, Parallelize" and "B.19: Modes, Parallelize, Number Of Cores" for more information about auto-parallelization arguments.)

Currently, shared memory parallel hardware platforms are supported, such as multi-core CPUs and supercomputers, by utilizing the OpenMP parallelization software. In the near future we will add support for distributed memory platforms, such as clusters and the cloud, by utilizing the MPI parallelization software, as well as GPUs and other specialty hardware by utilizing the OpenCL parallelization software.

Auto-parallelization is triggered in RPerl by simply including the word 'PARALLEL' in a loop label; everything inside that loop will be automatically parallelized, including multiply-nested loops. RPerl implements the polytope model (AKA polyhedral model) for loop parallelization, by utilizing the Pluto PolyCC polytope software.
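As a rough illustration of the loop-label convention described above, the following sketch attaches a label containing the word PARALLEL to an ordinary loop. The label name PARALLEL_SCALE_LOOP, the variable names, and the data are all hypothetical. The code is written as plain untyped Perl 5 so it can be run with a normal perl interpreter, where the label has no special effect and the loop simply executes sequentially; real RPerl code would also need the usual RPerl header and data types.

```perl
use strict;
use warnings;

# hypothetical input data; each loop iteration is independent of the others
my @input  = (1, 2, 3, 4, 5, 6, 7, 8);
my @output = (0) x scalar @input;

# the word PARALLEL appears in the loop label, as the text above describes;
# plain Perl treats this as an ordinary loop label
PARALLEL_SCALE_LOOP:
for my $i (0 .. $#input) {
    $output[$i] = $input[$i] * 2;
}

print join(q{, }, @output), "\n";   # 2, 4, 6, 8, 10, 12, 14, 16
```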

In addition to auto-parallelization, a number of other new features were released after RPerl v1.0 and by-or-before v2.0, including but not limited to:

Section 1.27: Exercises

1. Hello World [ 15 mins ]

On a computer with RPerl already installed, create a directory named LearningRPerl containing a sub-directory named Chapter1. Using the Foo Bar example program as a template, manually type a new RPerl program into a file named exercise_1-hello_world.pl inside the LearningRPerl/Chapter1 sub-directory. The sole purpose of your first program is to use the print operator and simply display the following one line of text output, followed by one newline character:

    Hello, World!

Run your new program by issuing the following command at your terminal command prompt:

    $ rperl -t LearningRPerl/Chapter1/exercise_1-hello_world.pl

HINT: You only need the "USER DEFAULT 1" no critic command, so your resulting program should be 7 lines long, not counting comments or blank lines.

2. RPerl Commands [ 15 mins ]

First, run the following RPerl command, and observe the output for use in 2a and 2b below:

    $ rperl -?

2a. What are some RPerl command-line options with which you are already familiar?

2b. With which options are you unfamiliar?

Next, run the following 3 RPerl commands, for 2c and 2d below:

    $ rperl -t -V LearningRPerl/Chapter1/exercise_1-hello_world.pl
    $ rperl -t -D LearningRPerl/Chapter1/exercise_1-hello_world.pl
    $ rperl -t -V -D LearningRPerl/Chapter1/exercise_1-hello_world.pl

2c. How do the outputs of these 3 commands differ from the output of Exercise 1?

2d. How do the outputs differ from one another?

3. Foo Bar Arithmetic [ 15 mins ]

Manually type the entire Foo Bar Arithmetic example program into a file named exercise_3-foo_bar_arithmetic.pl inside the LearningRPerl/Chapter1 sub-directory. (Even if you have already used copy-and-paste on the Foo Bar Arithmetic example program, you should still use this as an opportunity to build some RPerl muscle memory and type it in by hand.)

Modify your program by adding a new floating-point numeric variable named $zab and setting its value to $foo divided by $bar (don't forget to_number()); also change the starting value of $bar, and use print to generate the following output:

    have $foo = 33
    have $bar = 966
    have $baz = 29.272_727_272_727_3
    have $zab = 0.034_161_490_683_229_8

Run your program thusly:

    $ rperl -t LearningRPerl/Chapter1/exercise_3-foo_bar_arithmetic.pl


CHAPTER 2: SCALAR VALUES & VARIABLES (NUMBERS & TEXT)

Most programming languages include the basic principle of using named "variables" to store data values such as numbers, text strings, and lists of multiple numbers or strings. As stated in "Section 1.23: What Are The Parts Of That Sample RPerl Program?" above, a "variable is simply a place to store your individual pieces of data, identified by a name of your choice". The term "variable" is a reference to the fact that the value stored inside a variable may "vary" according to the needs of the programmer. If a stored value will never change, then you should use a "constant" instead of a variable. (Please see "Section 2.5: Constant Data" for more information about constants.)

Multiple variables may be created in a computer program, each with a different name such as $foo or $bar or $baz, and each potentially containing a different value. You can choose whatever name you want for each variable you create, such as the following three variables:

    my integer $some_whole_number   = 23;
    my number  $some_decimal_number = 21.12;
    my string  $some_funny_word     = 'howdy';

A single piece of data, such as one number or one string, is called a "scalar". Multiple pieces of data combined into a single aggregate structure may be either an "array" or a "hash", described in chapters 3 and 6, respectively. (Although sharing the same terminology, the hash data structure is not related to the hash # tic-tac-toe character.) In normal Perl, only scalar variable names begin with the dollar-sign $ "sigil", while aggregate data structures are stored in variables starting with different sigils like at-sign @ or percent-sign %. A sigil is simply a special character prefixed to a word, in order to help us quickly identify different source code components. In RPerl, all variable names begin with the $ sigil, both scalar types and aggregate structures alike.

RPerl provides 7 scalar data types:

  • boolean
  • unsigned_integer
  • integer
  • gmp_integer
  • number
  • character
  • string

Of the 7 RPerl scalar data types, 3 are directly (natively) supported by the Perl 5 core: integer, number, and string. This means the Perl 5 core is capable of directly identifying and storing those 3 core types. The remaining 4 non-core types are indirectly supported by the Perl 5 interpreter: boolean and unsigned_integer can be stored within either an integer or number; character can be stored within a string; and gmp_integer is supported by the use bigint; wrapper around the Math::BigInt::GMP module.

When RPerl application source code is compiled from RPerl into C++, all 7 data types are natively supported by C++ for high-speed execution.
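To give a feel for what arbitrary-precision integer support looks like in Perl 5, here is a minimal sketch using the use bigint; pragma mentioned above. It is plain untyped Perl 5 rather than RPerl, the variable names are hypothetical, and it does not assume any particular backend library is installed; the pragma is confined to one block so the rest of the file keeps normal numeric behavior.

```perl
use strict;
use warnings;

my $big_text;
{
    # inside this block, integer literals and their results become arbitrary-precision objects
    use bigint;
    my $big = 2**100;     # far beyond the range of any 64-bit integer
    $big_text = "$big";   # convert to text while the value is still exact
}

print $big_text, "\n";    # 1267650600228229401496703205376
```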

A single group of actual numeric digit(s) or quoted string character(s) is called a "literal", such as:

    -21         # integer or gmp_integer or number

    'howdy'     # string

    -23.421_12  # number

    1_234_567   # unsigned_integer or integer or gmp_integer or number

    1_234_567_890_123_456_789_012_345_678_901_234_567_890_123_456_789_012_345  # gmp_integer

    'One million, two-hundred-thirty-four thousand, five-hundred-sixty-seven'  # string

    '1'         # character or string

    'a'         # character or string

    "\n"        # newline character or string

    q{}         # empty character or string

    0           # boolean or unsigned_integer or integer or gmp_integer or number

Section 2.1: Numeric Data & Operators

RPerl provides 5 numeric data types:

  • boolean
  • unsigned_integer
  • integer
  • gmp_integer
  • number

Perl 5 provides several "built-in operators" designed for use with numeric data, which can be organized into 6 general categories:

  • Arithmetic
  • Trigonometry
  • Comparison (Relational & Equality)
  • Logic
  • Bitwise
  • Miscellaneous

Most of the operators which have names consisting of normal letters (a - z) are classified as "functions" in Perl 5 terminology. Notable exceptions are the logic operators, which are simply classified as "operators" in Perl, along with most of the operators which have names consisting of special characters. For the sake of simplicity, we will only use the term "operator" throughout this textbook.

http://perldoc.perl.org/perlop.html

http://perldoc.perl.org/perlfunc.html

Each operator in Perl 5 (and thus RPerl) is assigned 4 important characteristics: "arity" (a number), "fixity" (a placement location), "precedence" (a number) and "associativity" (a chirality or "handedness"). Operators of unary arity accept exactly 1 input operand, binary operators accept exactly 2 operands, etc. Prefix operators appear before their respective operands, postfix appear after, infix appear between, and closed operators appear both before and after their operands. Operators with a lower numeric precedence are executed before operators with a higher precedence; in the absence of parentheses, multiplication executes before addition because multiplication has a lower precedence number. Operators with equal precedence number are grouped by (and executed in order of) associativity; in the absence of parentheses, multiple subtraction operators will execute from left to right because subtraction is left-associative, whereas multiple exponent operators will execute from right to left because exponentiation is right-associative. For more information, see the Appendix:

"D.3: Syntax Arity, Fixity, Precedence, Associativity"

Beyond the built-in math operators in Perl 5, more advanced operators and functions are available via the MathPerl software suite, which is (perhaps unsurprisingly) optimized using the RPerl compiler.

MathPerl on CPAN
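The precedence and associativity rules described earlier in this section can be checked with a few one-line calculations. The following is a small sketch in plain untyped Perl 5, with hypothetical variable names chosen only for illustration.

```perl
use strict;
use warnings;

my $mixed       = 2 + 3 * 4;     # multiplication first:  2 + (3 * 4)   = 14
my $grouped     = (2 + 3) * 4;   # parentheses override:  (2 + 3) * 4   = 20
my $left_assoc  = 10 - 4 - 3;    # left to right:         (10 - 4) - 3  = 3
my $right_assoc = 2 ** 3 ** 2;   # right to left:         2 ** (3 ** 2) = 512

print "$mixed $grouped $left_assoc $right_assoc\n";   # 14 20 3 512
```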

Section 2.1.1: Boolean Literals

The most memory-efficient numeric literal is boolean, which represents a single "bit" (binary digit) of information. A boolean literal may only take the value of exactly 0 or 1.

    0     # boolean
    1     # boolean
    -1    # not a boolean
    1.5   # not a boolean
    -1.5  # not a boolean

Section 2.1.2: Unsigned Integer Literals

The second most efficient numeric literal is unsigned_integer, which represents a single whole (non-decimal) number greater-than or equal to 0. An unsigned_integer literal may describe any non-negative whole number, within the data size limits of the data types supported by your operating system software and computer hardware. An unsigned_integer may not describe a negative number or a non-whole number.

    23      # unsigned_integer
    0       # unsigned_integer
    42_230  # unsigned_integer
    -23     # not an unsigned_integer
    42.1    # not an unsigned_integer
    999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999  # bad unsigned_integer, outside data type limits

Section 2.1.3: Integer Literals

The third most efficient numeric literal is integer, which represents a single whole (non-decimal) number. An integer literal may describe any positive or negative whole number, within your operating system and hardware data type limits.

    -23     # integer
    0       # integer
    42_230  # integer
    42.1    # not an integer
    -999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999  # bad integer, outside data type limits

Section 2.1.4: GMP Integer Literals

The GNU Multi-Precision (GMP) software library is utilized to provide the gmp_integer numeric literal, representing a single whole (non-decimal) number which may safely exceed the data type limits of your operating system and hardware. A gmp_integer literal may describe any positive or negative whole number, within the limits of the memory (real or virtual) available to your RPerl code.

    -23     # gmp_integer
    0       # gmp_integer
    42_230  # gmp_integer
    42.1    # not a gmp_integer
    -999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999  # gmp_integer

Section 2.1.5: Number Literals

The number numeric literal represents a single floating-point (decimal) number, and may express any real number within your computer's data type limits.

    -23.42     # number
    0.000_001  # number
    42.23      # number
    42         # number
    -4_123.456_789_123_456_789_123_456_789_123_456_789_123_456_789_123_456_789_123_456  # bad number, outside data type limits

Section 2.1.6: Underscore Digit Separators

For unsigned_integer, integer, gmp_integer, and number literals, an "underscore" _ character must be inserted after every third digit away from the decimal point, where the underscore is used in a manner similar to a comma when writing long numbers by hand.

    1_234_567  # integer, same as "1,234,567" in American notation
    -32_123    # integer, same as "-32,123" in American notation
    -32123     # bad integer, missing underscore

    1.234_567           # number, same as "1.234567" in American notation
    -32_123.456_789_01  # number, same as "-32,123.45678901" in American notation
    -32_123.456_78901   # bad number, missing underscore
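Underscores are purely visual: Perl ignores them when reading a numeric literal, so the value is unchanged. The sketch below shows this in plain untyped Perl 5, where underscores are optional; the requirement to always include them is RPerl's, as stated above. Variable names are hypothetical.

```perl
use strict;
use warnings;

# underscores inside a numeric literal do not change its value
my $with    = 1_234_567;
my $without = 1234567;
if ($with == $without) { print 'same integer value', "\n"; }

my $fraction_with    = -32_123.456_789_01;
my $fraction_without = -32123.45678901;
if ($fraction_with == $fraction_without) { print 'same floating-point value', "\n"; }
```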

Section 2.1.7: Optional Positive Sign

For unsigned_integer, integer, gmp_integer, and number literals, an optional + plus sign may be prepended to explicitly indicate a numeric literal is positive (greater-than zero).

    1   # positive one
    +1  # also positive one

BEST PRACTICES

  • When only positive numeric literals are used in one area of code, omit positive signs.
  • When both positive and negative literals are used in one code area, use signs for all applicable literals.

    +23    # NOT BEST PRACTICE: not aligned with other unsigned literal below
    +55.6  # NOT BEST PRACTICE: not aligned with other unsigned literal below
    42

    23     #     BEST PRACTICE:     aligned with other unsigned literal below, best for all-positive literals
    55.6   #     BEST PRACTICE:     aligned with other unsigned literal below, best for all-positive literals
    42

    23     # NOT BEST PRACTICE:      not aligned with other signed literals below
    55.6   # NOT BEST PRACTICE:      not aligned with other signed literals below
    -21
    -66.5

     23    # NOT BEST PRACTICE: manually aligned with other signed literals below, but will not automatically align via Perl::Tidy
     55.6  # NOT BEST PRACTICE: manually aligned with other signed literals below, but will not automatically align via Perl::Tidy
    -21
    -66.5

    +23    #     BEST PRACTICE:          aligned with other signed literals below, best for mixed-sign literals
    +55.6  #     BEST PRACTICE:          aligned with other signed literals below, best for mixed-sign literals
    -21
    -66.5

Section 2.1.8: Scientific Notation

For unsigned_integer, integer, and number literals, very large or very small numbers may be approximated using "scientific notation", where each number is normalized to have exactly one digit to the left of the decimal point, then a lower-case e character and an appropriate integer power-of-ten is appended to the resulting normalized floating-point number. The e character stands for "exponent", as in "exponent of ten", and the Perl style of scientific notation is sometimes more accurately referred to as "scientific e notation".

As with normal integers, negative exponents must be prefixed with a - minus sign and positive exponents may be optionally prefixed with a + plus sign.

    1_234_567_000     # good integer
    1.234_567_000e09  # good number, same as "1_234_567_000" in scientific notation

    0.001_234_567_000  # good number
    1.234_567_000e-03  # good number, same as "0.001_234_567_000" in scientific notation

    -0.000_000_000_000_000_000_000_001_234_567  # bad number, outside data type limits
    -1.234_567e-24  # good number, same as "-0.000_000_000_000_000_000_000_001_234_567" in scientific notation
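A literal written in scientific e notation holds the same value as its long-hand form, which can be confirmed with a direct comparison. This sketch is plain untyped Perl 5 with hypothetical variable names; it uses values that are stored exactly, so the equality test is safe here even though exact comparison of floating-point values is generally discouraged.

```perl
use strict;
use warnings;

# positive exponent: move the decimal point 9 places to the right
my $large_plain      = 1_234_567_000;
my $large_scientific = 1.234_567_000e09;
if ($large_plain == $large_scientific) { print 'same large value', "\n"; }

# negative exponent: move the decimal point 1 place to the left
my $small_plain      = 0.25;
my $small_scientific = 2.5e-01;
if ($small_plain == $small_scientific) { print 'same small value', "\n"; }
```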

BEST PRACTICES

  • Use 2 digits to represent all exponents.
  • When only positive exponents are used, omit exponent signs.
  • When both positive and negative exponents are used, use signs for all exponents.

    1_234_567_000      # NOT BEST PRACTICE: no exponent
    1.234_567_000e9    # NOT BEST PRACTICE: not aligned with two-digit exponents below
    1.234_567_000e+09  # NOT BEST PRACTICE: not aligned with two-digit exponents below
    1.234_567_000e09   #     BEST PRACTICE:     aligned with two-digit exponents below, best for all-positive exponents
    1.234_567_000e19
    2.111_000_333e04

    1.234_567_000e09   # NOT BEST PRACTICE: not aligned with signed exponents below
    1.234_567_000e+09  #     BEST PRACTICE:     aligned with signed exponents below, best for mixed-sign exponents
    1.234_567_000e-09
    2.111_000_333e-04

    # accuracy of following numbers may be reduced on computers with lower precision data types
    +1.537_969_711_485_091_65e+21  # BEST PRACTICE: best for mixed-sign exponents
    -2.591_931_460_998_796_41e+01  # BEST PRACTICE: best for mixed-sign exponents
    +1.792_587_729_503_711_81e-01  # BEST PRACTICE: best for mixed-sign exponents
    +2.680_677_724_903_893_22e-03  # BEST PRACTICE: best for mixed-sign exponents
    +1.628_241_700_382_422_95e-03  # BEST PRACTICE: best for mixed-sign exponents
    -9.515_922_545_197_158_70e-15  # BEST PRACTICE: best for mixed-sign exponents

Section 2.1.9: Truth Values

In the simplest case, a "truth value" may be represented by a boolean literal where the numeric value of 0 represents the truth value of "false", and numeric 1 represents "true". In general, a truth value is any data which may be recognized by Perl (and thus RPerl) as being either true or false; there is no third option.

Perl recognizes relatively few values as false, of which only 4 are accepted by RPerl:

      0   # false, number zero
     '0'  # false, text   zero
    q{0}  # false, text   zero
    q{}   # false, text   empty

All other values which RPerl accepts are recognized to hold a truth value of true.

All numeric operators in the comparison and logic sub-categories, as well as all string operators in the comparison sub-category, will generate truth values as output.

Perl attaches magic to the truth value of false, allowing it to be utilized as either a normal truth value, or a numeric value of 0, or an empty string value of q{}. This magic behavior is not supported by C++, and thus not supported by RPerl.

In C and C++, only numeric 0 is universally recognized as false, while the text 0 and empty text are recognized as true, which is different than Perl. To achieve compatibility, RPerl automatically inserts additional C++ logic in all compiled output code to check for the 2 remaining RPerl false values of text character zero '0' or q{0}, and empty text q{}. This ensures the compiled C++ output code will behave identically to the original RPerl input source code, with regard to truth values.
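The 4 false values listed above can be checked one at a time, alongside a few similar-looking values which Perl treats as true. This is a sketch in plain untyped Perl 5; the truth_of subroutine is a hypothetical helper written for this illustration, and the additional true values are shown only to demonstrate Perl's behavior, not to suggest RPerl accepts every one of them.

```perl
use strict;
use warnings;

# return the text 'true' or 'false' for any value, using it only as an if () condition
sub truth_of {
    my ($value) = @_;
    if ($value) { return 'true'; }
    return 'false';
}

# the 4 false values
print truth_of(0),    "\n";    # false, number zero
print truth_of('0'),  "\n";    # false, text   zero
print truth_of(q{0}), "\n";    # false, text   zero
print truth_of(q{}),  "\n";    # false, text   empty

# values which merely look similar are true
print truth_of('00'),  "\n";   # true, text of length 2
print truth_of('0.0'), "\n";   # true, text of length 3
print truth_of(q{ }),  "\n";   # true, text containing one space
print truth_of(-1),    "\n";   # true, non-zero number
```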

WARNING FOR ALL COMPARISON & LOGIC OPERATORS:

Due to Perl's magic attached to truth values of false, as well as the difference between Perl and C++ recognized truth values, you may experience unexpected or undefined behavior if a truth value is utilized anywhere except true-or-false conditions in loops and conditional statements.

Only utilize the truth values returned by comparison and logic operators within the condition enclosed by parentheses in if (), elsif (), for (), or while ().

    if    (1 > 2)     { print 'I think not', "\n"; }  # good use of greater-than operator
    elsif ($x and $y) { print 'Maybe',       "\n"; }  # good use of          and operator

    for   (my integer $i = 0; $i < 23; $i++) { print 'finite loop',   "\n"; }  # good use of less-than operator
    while (                    1 != 2      ) { print 'infinite loop', "\n"; }  # good use of not-equal operator

    my integer $foo = 3 + (1 >= 2);   # UNEXPECTED BEHAVIOR: bad use of greater-than-or-equal operator
    my integer $bar = 3 * (1 <= 2);   # UNEXPECTED BEHAVIOR: bad use of    less-than-or-equal operator
    my integer $bat = sin (1 == 2);   # UNEXPECTED BEHAVIOR: bad use of                 equal operator

Section 2.1.10: Floating-Point Error

A "floating-point number" is any number which includes a decimal point . as part of the numeric representation, as opposed to an integer which does not include a decimal point. In RPerl, all floating-point values are stored in variables of the data type number.

       0  # integer
       1  # integer
    -123  # integer
    
       0.1    # floating-point
       1.654  # floating-point
    -123.4    # floating-point
    
    my integer $some_int   = 23;     # integer variable
    my number  $some_float = 23.42;  # floating-point variable

All computer languages which perform calculations on floating-point numbers are susceptible to "floating-point error", which is any inaccuracy due to incorrect rounding or lack of available precision.

For example, suppose you have a floating-point number 0.105, which seems normal enough. Let us further suppose you want to divide 0.105 by some other number, say 1_000, and by the simple rules of arithmetic you would expect to arrive at an answer of 0.000_105. Unfortunately, due to the way floating-point numbers are stored in computer memory, the actual result is slightly less than 0.000_105 (closer to 0.000_104_999), so this calculation is one of many possible cases of floating-point error.

           (0.105 / 1_000) == 0.000_105    # UNEXPECTED BEHAVIOR: false
    print ((0.105 / 1_000) -  0.000_105);  # UNEXPECTED BEHAVIOR: -1.355_252_715_606_88e-20

It will usually not be possible to easily predict when and where floating-point error will occur. In general, you may experience the effects of floating-point error whenever your code relies upon one or more of the following:

To compensate for unpredictable floating-point error, you should use the "floating-point epsilon" value stored in the constant RPerl::EPSILON(), which is a very small number used to help detect inaccuracies. Whenever you want to directly compare two floating-point values, instead use the subtraction - and absolute value abs operators to take the positive difference, then use the less-than < operator to compare the difference to the floating-point epsilon value. If the difference is less-than the floating-point epsilon, then the two input floating-point values can be considered to be numerically equal.

                                         RPerl::EPSILON()  # A VERY SMALL NUMBER: 0.000_000_000_000_000_2
         (0.105 / 1_000) == 0.000_105                      # UNEXPECTED BEHAVIOR: false
    abs ((0.105 / 1_000) -  0.000_105) < RPerl::EPSILON()  #   EXPECTED BEHAVIOR: true

    my number $foo = 0.105 / 1_000;
    my number $faa = 0.105 / 1_000;
    my number $bar = 0.000_105;
    my number $bat = 0.000_105;
         if (0.000_105 == 0.000_105)             { print 'true'; } else { print 'false'; }  #   EXPECTED BEHAVIOR: true
         if ((0.105 / 1_000) == (0.105 / 1_000)) { print 'true'; } else { print 'false'; }  #   EXPECTED BEHAVIOR: true
         if ($bar == $bat)                       { print 'true'; } else { print 'false'; }  #   EXPECTED BEHAVIOR: true
         if ($foo == $faa)                       { print 'true'; } else { print 'false'; }  #   EXPECTED BEHAVIOR: true
         if ($foo == $bar)                       { print 'true'; } else { print 'false'; }  # UNEXPECTED BEHAVIOR: false
    if (abs ($foo -  $bar) < RPerl::EPSILON())   { print 'true'; } else { print 'false'; }  #   EXPECTED BEHAVIOR: true

WARNING FOR ALL FLOATING-POINT NUMERIC OPERATORS:

Due to floating-point error, unexpected behavior may be experienced if a floating-point value is tested for exact equality with any other value.

Always use the floating-point epsilon value RPerl::EPSILON() to check for approximate equality instead of exact equality.

Section 2.1.11: Arithmetic Operators

Name                          Symbol  Arity   Fixity  Precedence  Associativity  Supported
Absolute Value                abs     Unary   Prefix  01          Left           Coming Soon
Natural Exponential Function  exp     Unary   Prefix  01          Non            Coming Soon
Exponent AKA Power            **      Binary  Infix   04          Right          Yes
Negative with Parentheses     -( )    Unary   Closed  05          Right          Yes
Multiply                      *       Binary  Infix   07          Left           Yes
Divide                        /       Binary  Infix   07          Left           Yes
Modulo AKA Modulus            %       Binary  Infix   07          Left           Yes
Add                           +       Binary  Infix   08          Left           Yes
Subtract                      -       Binary  Infix   08          Left           Yes
Natural Logarithm             log     Unary   Prefix  10          Non            Coming Soon
Square Root                   sqrt    Unary   Prefix  10          Non            Coming Soon
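For a quick feel of the arithmetic operators listed above, the following sketch evaluates several of them on small numbers. It is plain untyped Perl 5, so it also runs the operators marked "Coming Soon" for RPerl; the variable names are hypothetical.

```perl
use strict;
use warnings;

my $power     = 2 ** 10;     # exponent:        1024
my $negative  = -(17 - 5);   # negative:        -12
my $product   = 17 * 5;      # multiply:        85
my $quotient  = 17 / 5;      # divide:          3.4
my $remainder = 17 % 5;      # modulo:          2
my $root      = sqrt 16;     # square root:     4
my $absolute  = abs(-7.5);   # absolute value:  7.5

print "$power $negative $product $quotient $remainder $root $absolute\n";
# 1024 -12 85 3.4 2 4 7.5
```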

Section 2.1.12: Trigonometry Operators

Name               Symbol  Arity   Fixity  Precedence  Associativity  Supported
Arctangent-Divide  atan2   Binary  Prefix  01          Left           Coming Soon
Sine               sin     Unary   Prefix  10          Non            Coming Soon
Cosine             cos     Unary   Prefix  10          Non            Coming Soon

Section 2.1.13: Comparison (Relational & Equality) Operators

Name                                Symbol  Arity   Fixity  Precedence  Associativity  Supported
Less-Than                           <       Binary  Infix   11          Non            Yes
Greater-Than                        >       Binary  Infix   11          Non            Yes
Less-Than-Or-Equal                  <=      Binary  Infix   11          Non            Yes
Greater-Than-Or-Equal               >=      Binary  Infix   11          Non            Yes
Equal                               ==      Binary  Infix   12          Non            Yes
Not-Equal                           !=      Binary  Infix   12          Non            Yes
Three-Way Comparison AKA Spaceship  <=>     Binary  Infix   12          Non            Coming Soon
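Unlike the other comparison operators, the three-way comparison operator returns a number instead of a truth value: -1, 0, or 1, depending on whether the left operand is less than, equal to, or greater than the right operand. A short sketch in plain untyped Perl 5, with hypothetical variable names, shows this next to ordinary comparisons used as conditions.

```perl
use strict;
use warnings;

# three-way comparison returns -1, 0, or 1
my @comparisons = ( 2 <=> 3, 3 <=> 3, 4 <=> 3 );
print join(q{ }, @comparisons), "\n";   # -1 0 1

# the other comparison operators belong inside a condition
if ( 2 <  3 ) { print '2 is less than 3', "\n"; }
if ( 3 >= 3 ) { print '3 is greater than or equal to 3', "\n"; }
if ( 4 != 3 ) { print '4 is not equal to 3', "\n"; }
```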

Section 2.1.14: Logic Operators

Name              Symbol  Arity   Fixity  Precedence  Associativity  Supported
Logical Negation  !       Unary   Prefix  05          Right          Yes
Logical And       &&      Binary  Infix   15          Left           Yes
Logical Or        ||      Binary  Infix   16          Left           Yes
Logical Negation  not     Unary   Prefix  22          Right          Yes
Logical And       and     Binary  Infix   23          Left           Yes
Logical Or        or      Binary  Infix   24          Left           Yes
Logical Xor       xor     Binary  Infix   24          Left           Yes
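The symbol and word forms of the logic operators mean the same thing but sit at very different precedence levels, which changes how a condition is grouped. The sketch below, in plain untyped Perl 5 with hypothetical variable names, keeps every logic operator inside an if () condition, in line with the truth value warning in "Section 2.1.9: Truth Values".

```perl
use strict;
use warnings;

my $yes = 1;
my $no  = 0;
my @results;

# ! (precedence 05) binds tighter than && (15): grouped as (!$yes) && $no
if (   ! $yes && $no ) { push @results, 'true'; } else { push @results, 'false'; }

# not (precedence 22) binds looser than && (15): grouped as not ($yes && $no)
if ( not $yes && $no ) { push @results, 'true'; } else { push @results, 'false'; }

# xor is true only when exactly one operand is true
if ( $yes xor $no  ) { push @results, 'true'; } else { push @results, 'false'; }
if ( $yes xor $yes ) { push @results, 'true'; } else { push @results, 'false'; }

print join(q{ }, @results), "\n";   # false true true false
```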

Section 2.1.15: Bitwise Operators

Name                               Symbol  Arity   Fixity  Precedence  Associativity  Supported
Bitwise Negation with Parentheses  ~( )    Unary   Closed  05          Right          Yes
Bitwise Shift Left                 <<      Binary  Infix   09          Left           Coming Soon
Bitwise Shift Right                >>      Binary  Infix   09          Left           Coming Soon
Bitwise And                        &       Binary  Infix   13          Left           Coming Soon
Bitwise Or                         |       Binary  Infix   14          Left           Coming Soon
Bitwise Xor                        ^       Binary  Infix   14          Left           Coming Soon

All code examples are shown using 64-bit integers; your results may vary depending upon your C++ compiler and hardware platform.

WARNING FOR ALL BITWISE OPERATORS:

Due to the difference between how integer and non-integer scalar values are stored in memory, you may experience unexpected or undefined behavior if a bitwise operator is passed any non-integer operands as input.

Only utilize bitwise operators with integer, unsigned_integer, or boolean operands.

    ~(1)      # good use of bitwise negation    operator
    12 << 2   # good use of bitwise shift left  operator
    13 >> 3   # good use of bitwise shift right operator

    my unsigned_integer $foo = 14;
    $foo & 4  # good use of bitwise and operator
    $foo | 5  # good use of bitwise  or operator
    $foo ^ 6  # good use of bitwise xor operator

    my number $bar = 21.12;
    ~($bar)         # UNEXPECTED BEHAVIOR: bad use of bitwise negation    operator
      $bar <<    7  # UNEXPECTED BEHAVIOR: bad use of bitwise shift left  operator
         8 >> $bar  # UNEXPECTED BEHAVIOR: bad use of bitwise shift right operator

    23.42 &   9   # UNEXPECTED BEHAVIOR: bad use of bitwise and operator
    23    ^ 1.1   # UNEXPECTED BEHAVIOR: bad use of bitwise xor operator
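The results of the good examples above can be worked out by hand from the binary form of each operand. The following sketch computes them in plain untyped Perl 5, using hypothetical variable names; the binary digits appear in the comments.

```perl
use strict;
use warnings;

my $foo = 14;                 # binary 1110

my $shift_left  = 12 << 2;    # 48, binary 1100 shifted left  2 places is 110000
my $shift_right = 13 >> 3;    # 1,  binary 1101 shifted right 3 places is 1
my $and         = $foo & 4;   # 4,  1110 and 0100 is 0100
my $or          = $foo | 5;   # 15, 1110 or  0101 is 1111
my $xor         = $foo ^ 6;   # 8,  1110 xor 0110 is 1000

print "$shift_left $shift_right $and $or $xor\n";   # 48 1 4 15 8
```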

Also, due to the difference between how signed and unsigned integer values are stored in memory, Perl's use integer; pragma configuration command and RPerl's integer and unsigned_integer data types must be utilized in correct combination, as described below. Perl relies on the use integer; pragma to determine numeric data types, while C++ relies on the actual data types provided by the software developers as part of each variable declaration. You may experience unexpected behavior if a bitwise operator is utilized with mismatching pragma and data types.

Each call to Perl's use integer; pragma applies to 1 entire RPerl source code file at a time, either 1 full *.pl program file or 1 full *.pm module file. If the use integer; pragma is in effect for a specific RPerl file, then all bitwise operators in said file must be provided with input operands which are signed (positive or negative) integers only, not unsigned integers or floating-point numbers or other data types. If use integer; is not in effect for an RPerl file, then all bitwise operators in said file must be provided with operands which are unsigned (non-negative) integers or booleans only, not signed integers or other data types.

The use integer; pragma also affects the results of all arithmetic and comparison operators by discarding the non-integer (fractional) portion of input operands, as if the int operator were called to convert each floating-point number into an integer. Because of this, floating-point operators and signed integer bitwise operators must not be included in the same single RPerl source code file; to compensate, simply move all signed bitwise operators into their own separate RPerl *.pm module file which calls use integer;.

Please see the Perl documentation for more information about the use integer; pragma:

http://perldoc.perl.org/integer.html

    # good combination: no 'use integer;' pragma, 
    # unsigned_integer data types for bitwise negation operator, number data type for multiplication operation
    my unsigned_integer $foo = 5;
    my unsigned_integer $bar = ~($foo);  # $bar  = 18_446_744_073_709_551_610
    my number $quux = 21.12 * 42.23;     # $quux = 891.897_6

    # good combination: 'use integer;' pragma,
    # integer data types for both bitwise negation operator and multiplication operator
    use integer;
    my integer $foo = 5;
    my integer $bar = ~($foo);   # $bar  = -6
    my integer $quux = 21 * 42;  # $quux =  882

    # bad combination: no 'use integer;' pragma, 
    # integer data types for bitwise negation operator
    my integer $foo = 5;
    my integer $bar = ~($foo);   # UNEXPECTED BEHAVIOR: ($bar = 18_446_744_073_709_551_610) in Perl, but ($bar = -6) in C++

    # bad combination: 'use integer;' pragma,
    # unsigned_integer data types for bitwise negation operator, number data type for multiplication operator
    use integer;
    my unsigned_integer $foo = 5;
    my unsigned_integer $bar = ~($foo);  # UNEXPECTED BEHAVIOR: ($bar = -6) in Perl, but ($bar = 18_446_744_073_709_551_610) in C++
    my number $quux = 21.12 * 42.23;     # UNEXPECTED BEHAVIOR: ($quux = 882) in Perl, but ($quux = 891.897_6) in C++
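The Perl side of the behavior described above can be observed directly. In plain Perl 5 the use integer; pragma only lasts until the end of its enclosing block, which this sketch relies upon to show both behaviors in one listing; RPerl code should still follow the one-pragma-per-file rule given above. The code is untyped and the variable names are hypothetical.

```perl
use strict;
use warnings;

my ($unsigned_result, $signed_result, $float_product, $integer_product);

{
    # no 'use integer;' in this block: bitwise negation yields a large non-negative value
    $unsigned_result = ~5;              # 18_446_744_073_709_551_610 with 64-bit integers
    $float_product   = 21.12 * 42.23;   # fractional parts are kept
}

{
    # 'use integer;' is in effect until the end of this block
    use integer;
    $signed_result   = ~5;              # -6
    $integer_product = 21.12 * 42.23;   # 882, same as 21 * 42
}

print $unsigned_result, "\n";
print $float_product,   "\n";
print $signed_result,   "\n";
print $integer_product, "\n";
```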

Section 2.1.16: Miscellaneous Operators

Name                                                Symbol           Arity             Fixity  Precedence  Associativity  Supported
Floor with Parentheses                              POSIX::floor( )  Unary             Closed  01          Left           Coming Soon
Ceiling with Parentheses                            POSIX::ceil( )   Unary             Closed  01          Left           Coming Soon
Separate Integer & Fraction Parts with Parentheses  POSIX::modf( )   Binary            Closed  01          Left           Coming Soon
Integer Part                                        int              Unary             Prefix  10          Non            Coming Soon
Random Number                                       rand             Nullary or Unary  Prefix  10          Non            Coming Soon
Random Seed                                         srand            Nullary or Unary  Prefix  10          Non            Coming Soon
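The floor, ceiling, and integer part operators only differ for negative non-whole numbers, so a negative input makes a good test case. The sketch below uses Perl 5's standard POSIX module in plain untyped Perl 5, with hypothetical variable names; all of these operators are marked "Coming Soon" for RPerl in the table above.

```perl
use strict;
use warnings;
use POSIX ();   # load the POSIX module without importing any names

my $value = -2.5;

my $floor   = POSIX::floor($value);   # -3, rounds toward negative infinity
my $ceiling = POSIX::ceil($value);    # -2, rounds toward positive infinity
my $integer = int $value;             # -2, discards the fractional part

# modf returns the fractional part first, then the integer part
my ($fraction_part, $integer_part) = POSIX::modf(3.25);   # 0.25 and 3

print "$floor $ceiling $integer $fraction_part $integer_part\n";   # -3 -2 -2 0.25 3
```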

Section 2.2: Text Data & Operators

RPerl provides 2 text data types:

  • character
  • string

RPerl provides 3 delimiters for enclosing text data:

  • single quotes, such as 'h'
  • double quotes, such as "\n"
  • q with curly braces, such as q{}

Perl 5 provides several built-in operators designed for use with text data, which can be organized into 7 general categories:

Section 2.2.1: Character Literals

The most memory-efficient text literal is character, which represents exactly zero characters or one character of information. A character may express the value of any single numeric digit (0, 1, 2, ..., 8, 9); letter (a, b, c, ..., y, z, A, B, C, ..., Y, Z); or special ASCII character (!, #, *, +, etc.). If the character literal has length zero, meaning it represents zero characters of information, then it is called the "empty character" and contains no data.

    ''           # INVALID: use q{} for empty character
    '0'          #   VALID
    'h'          #   VALID
    '+'          #   VALID
    '\n'         # INVALID: too many characters, use "\n" for newline character
    '-1'         # INVALID: too many characters
    'howdy23!'   # INVALID: too many characters

    ""           # INVALID: use q{} for empty character
    "0"          #   VALID
    "h"          #   VALID
    "+"          #   VALID
    "\n"         #   VALID: newline
    "-1"         # INVALID: too many characters & invalid use of double-quotes
    "howdy23!"   # INVALID: too many characters & invalid use of double-quotes

    q{}          #   VALID: empty
    q{0}         #   VALID
    q{h}         #   VALID
    q{+}         #   VALID
    q{\n}        # INVALID: too many characters, use "\n" for newline character
    q{-1}        # INVALID: too many characters
    q{howdy23!}  # INVALID: too many characters

Section 2.2.2: String Literals

Any text data more than 1 character in length must be represented by a string literal, which is comprised of any combination of valid character literal characters (numeric digits, letters, and special ASCII characters). Like the empty character, if a string literal has length zero then it is called the "empty string" and contains no data.

    ''           # INVALID: use q{} for empty string
    '0'          #   VALID
    'h'          #   VALID
    '+'          #   VALID
    '\n'         # INVALID: not a newline, use "\n" for string containing newline
    '\\n'        #   VALID: interpolated to become two characters (backslash, letter n)
    '-1'         #   VALID
    'howdy23!'   #   VALID

    ""           # INVALID: use q{} for empty string
    "0"          # INVALID: invalid use of double-quotes, must contain newline or tab character(s)
    "h"          # INVALID: invalid use of double-quotes, must contain newline or tab character(s)
    "+"          # INVALID: invalid use of double-quotes, must contain newline or tab character(s)
    "\n"         #   VALID: interpolated to become newline character
    "-1"         # INVALID: invalid use of double-quotes, must contain newline or tab character(s)
    "howdy23!"   # INVALID: invalid use of double-quotes, must contain newline or tab character(s)

    q{}          #   VALID: empty string
    q{0}         #   VALID
    q{h}         #   VALID
    q{+}         #   VALID
    q{\n}        # INVALID: not a newline, use "\n" for string containing newline
    q{\\n}       #   VALID: interpolated to become two characters (backslash, letter n)
    q{-1}        #   VALID
    q{howdy23!}  #   VALID

Section 2.2.3: Single-Quotes

Text literals enclosed in single-quotes are the simplest and most common case in RPerl.

Single-quoted text literals are not "interpolated", which means the literal's data contents are not changed by Perl or RPerl in any way, except for the extra-special double-backslash \\ as described below. Because single-quotes do not activate string interpolation, you cannot use a single-quoted string literal to represent special characters such as newline or tab.

Do not use single-quotes to represent a newline or tab character; use double-quotes "\n" or "\t" instead.

Do not use single-quotes to represent an empty character or empty string; use q-quotes q{} instead.

In normal Perl, single-backslash characters \ are used to create special characters called "escape sequences", the most common of which are the well-known newline \n and tab \t escape sequences. Each valid escape sequence actually counts as only one character of computer data, even though it is represented to humans by 2 or more typed characters, so a single escape sequence may be utilized as either a character text literal or a string text literal. (Thus, "\n" and "\t" may both be utilized as either a character or string in RPerl, but only when using double-quotes as discussed in the following section.)

In single-quoted string literals, the only escape sequence supported by RPerl is the double-backslash \\, so the string literal '\\' is interpolated to mean only one single-backslash character \. To represent two backslash characters \\, utilize two double-backslash escape sequences in a row '\\\\'; for three backslashes \\\ utilize three escape sequences '\\\\\\', and so forth.

Because RPerl only accepts double-backslashes (not single-backslashes) within single-quotes, RPerl thus does not accept any odd number of consecutive backslash characters within single-quotes. For example, two or four backslashes in a row are supported, but one or three or five directly adjacent backslashes are not supported.
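
The character counts described above can be checked with the length operator. The following is a minimal sketch in normal Perl 5, with variable names chosen only for illustration; the last literal is accepted by normal Perl but not by RPerl, and is included only to show why single-quotes can not represent a newline.

    use strict;
    use warnings;

    my $one_backslash   = '\\';     # one double-backslash escape sequence
    my $two_backslashes = '\\\\';   # two double-backslash escape sequences
    my $newline         = "\n";     # one newline escape sequence, in double-quotes
    my $not_a_newline   = '\n';     # normal Perl only: INVALID in RPerl, single-backslash

    print length($one_backslash),   "\n";  # 1
    print length($two_backslashes), "\n";  # 2
    print length($newline),         "\n";  # 1
    print length($not_a_newline),   "\n";  # 2: backslash, letter n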

In normal Perl, the backslash single-quote \' escape sequence may be used to include a single-quote character ' within a single-quoted text literal. As stated above, RPerl only supports the double-backslash \\ escape sequence within single-quotes, so the backslash single-quote \' escape sequence is thus not supported. Use double-quotes "'" or q-quotes q{'} to represent a single-quote ' character in an RPerl string literal.

Single-quoted text literals must not contain:

Single-quoted text literals may contain:

    ''      # INVALID: empty string
    ' '     #   VALID: single space
    'a'     #   VALID: single letter
    '\'     # INVALID: single-backslash

    '   '   #   VALID: three spaces
    ' ' '   # INVALID: single-quote within single-quotes
    '\' '   # INVALID: backslash single-quote escape sequence, also single-backslash

    ' a '   #   VALID: space, letter, space
    '"a}'   #   VALID: double-quote, letter, right brace
    '\a}'   # INVALID: single-backslash       not interpolated as audible bell (alarm beep) escape sequence
    '\n}'   # INVALID: single-backslash       not interpolated as newline                   escape sequence
    '\\}'   #   VALID: double-backslash           interpolated as one backslash character, right brace
    '\\\'   # INVALID: odd  number of backslashes 
    '\\\\'  #   VALID: even number of backslashes interpolated as half as many backslash characters

BEST PRACTICES

  • Use single-quoted text literals whenever possible.
    'n'      #     BEST PRACTICE
    "n"      # NOT BEST PRACTICE: double-quotes not needed
    q{n}     # NOT BEST PRACTICE:      q-quotes not needed

    '1atx'   #     BEST PRACTICE
    "1atx"   # NOT BEST PRACTICE: double-quotes not needed
    q{1atx}  # NOT BEST PRACTICE:      q-quotes not needed

    '\\tx'   #     BEST PRACTICE
    "\\tx"   # NOT BEST PRACTICE: double-quotes    invalid, extra backslash
    q{\\tx}  # NOT BEST PRACTICE:      q-quotes not needed

Section 2.2.4: Double-Quotes

Text literals enclosed in double-quotes are fully interpolated in normal Perl, and are only used for trivial interpolation of strings containing the newline "\n" or tab "\t" escape sequences in RPerl. All double-quoted strings in RPerl must contain at least one newline or tab special character.

In addition to escape sequences, string interpolation in normal Perl is also triggered by finding either the dollar-sign $ or "at-sign" @ characters inside of a double-quoted string literal. Because RPerl does not support string interpolation, double-quoted string literals must not contain the $ or @ characters.
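
To see what RPerl is avoiding, the following minimal sketch shows string interpolation in normal Perl 5 only; the first literal is not valid RPerl. The variable names are chosen only for illustration, and the final line shows one interpolation-free way to build the same text in normal Perl.

    use strict;
    use warnings;

    my $name  = 'Perl';
    my @items = ( 'a', 'b', 'c' );

    # normal Perl only: double-quotes interpolate scalar and array variables
    my $interpolated = "Hello, $name: @items";  # Hello, Perl: a b c

    # single-quotes leave the very same characters untouched
    my $literal = 'Hello, $name: @items';       # Hello, $name: @items

    # interpolation-free equivalent, using the dot and join operators
    my $concatenated = 'Hello, ' . $name . ': ' . join( ' ', @items );

    print $interpolated, "\n";
    print $literal,      "\n";
    print $concatenated, "\n";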

Double-quoted string literals must not contain any backslash characters, other than those used in newline \n and tab \t escape sequences, and thus can not represent a single-backslash character \; use single-quotes '\\' or q-quotes q{\\} double-backslash escape sequences instead.

As with single-quotes, in normal Perl the backslash double-quote \" escape sequence may be used to include a double-quote character " within a double-quoted text literal. As stated above, RPerl only supports the newline \n and tab \t escape sequences within double-quotes, so the backslash double-quote \" escape sequence is thus not supported. Use single-quotes '"' or q-quotes q{"} to represent a double-quote " character in an RPerl string literal.

Double-quoted text literals must not contain:

Double-quoted text literals must contain 1 or more:

Double-quoted text literals may contain:

    ""      # INVALID: empty string
    " "     #   VALID: single space
    "a"     #   VALID: single letter
    "\"     # INVALID: extra  backslash,                                                                     only \n and \t supported

    "   "   #   VALID: three spaces
    " " "   # INVALID: double-quote within double-quotes
    "\" "   # INVALID: backslash double-quote escape sequence, also single-backslash

    " a "   #   VALID: space, letter, space
    "'a}"   #   VALID: single-quote, letter, right brace
    "\a}"   # INVALID: single-backslash           interpolated as audible bell (alarm beep) escape sequence, only \n and \t supported
    "\n}"   #   VALID: single-backslash           interpolated as newline                   escape sequence, right brace
    "\\}"   # INVALID: extra  backslashes,                                                                   only \n and \t supported
    "\\\"   # INVALID: extra  backslashes,                                                                   only \n and \t supported 
    "\\\\"  # INVALID: extra  backslashes,                                                                   only \n and \t supported

BEST PRACTICES

  • Use double-quoted text literals to contain newline \n and tab \t characters only, not other normal characters.
  • To represent a mixture of normal characters with newline and/or tab characters, enclose the normal characters in single-quotes, enclose the newline and tab characters in double-quotes, and use the dot . "string concatenation" operator to append one string literal to the other. (Please see "Section 2.2.6: Editing Operators" for more information about string concatenation.)
    "\n"        #     BEST PRACTICE:     newline         only
    "\t"        #     BEST PRACTICE:                 tab only
    "\t\n\t"    #     BEST PRACTICE:     newline and tab only

    "a\n"       # NOT BEST PRACTICE: not newline and tab only
    'a' . "\n"  #     BEST PRACTICE:     newline         only, additional characters in single-quotes

    "\tx"       # NOT BEST PRACTICE: not newline and tab only
    "\t" . 'x'  #     BEST PRACTICE:                 tab only, additional characters in single-quotes

    "a\tx\n"                 # NOT BEST PRACTICE: not newline and tab only
    'a' . "\t" . 'x' . "\n"  #     BEST PRACTICE:     newline and tab only, additional characters in single-quotes
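
The concatenated form produces exactly the same text as the single mixed literal, so nothing is lost by following the best practice. The following is a minimal sketch in normal Perl 5, with variable names chosen only for illustration.

    use strict;
    use warnings;

    my $mixed        = "a\tx\n";                   # NOT BEST PRACTICE
    my $concatenated = 'a' . "\t" . 'x' . "\n";    #     BEST PRACTICE

    if ( $mixed eq $concatenated ) {
        print 'identical, length ', length($concatenated), "\n";  # identical, length 4
    }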

Section 2.2.5: q Quotes

Text literals enclosed in "q-quotes" begin with lowercase letter q and left "curly-brace" characters q{, and end with the right curly-brace } character. You must use q-quotes to represent empty text q{} literals, which contain no characters. Curly braces are also known as "curly-brackets" or just "braces" for short.

Normal Perl supports q-quoted string literals using delimiters other than curly-braces, as well as "qq-quotes" which provide string interpolation in the same way as double-quoted strings. RPerl's existing string quoting mechanisms cover all non-interpolated use cases, so RPerl does not support the additional qq-quotes or non-curly-brace q-quotes, because TDNNTBMTOWTDI.

q-quoted literals behave exactly the same as single-quoted literals, other than the empty string q{} and the difference in delimiters.

q-quoted text literals must not contain:

q-quoted text literals may contain:

    q{}      #   VALID: empty string
    q{ }     #   VALID: single space
    q{a}     #   VALID: single letter
    q{\}     # INVALID: single-backslash

    q{   }   #   VALID: three spaces
    q{ } }   # INVALID: right brace within q-quotes
    q{\} }   # INVALID: backslash right brace escape sequence, also single-backslash

    q{ a }   #   VALID: space, letter, space
    q{"a'}   #   VALID: double-quote, letter, single-quote
    q{\a'}   # INVALID: single-backslash       not interpolated as audible bell (alarm beep) escape sequence
    q{\n'}   # INVALID: single-backslash       not interpolated as newline                   escape sequence
    q{\\'}   #   VALID: double-backslash           interpolated as one backslash character, single-quote
    q{\\\}   # INVALID: odd  number of backslashes 
    q{\\\\}  #   VALID: even number of backslashes interpolated as half as many backslash characters
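
The equivalence of q-quotes and single-quotes can be confirmed by comparing the resulting text. The following is a minimal sketch in normal Perl 5, with variable names chosen only for illustration.

    use strict;
    use warnings;

    my $single_quoted = '\\';     # one backslash character
    my $q_quoted      = q{\\};    # one backslash character
    my $empty         = q{};      # empty string, zero characters
    my $mixed         = q{"a'};   # double-quote, letter, single-quote

    if ( $single_quoted eq $q_quoted ) { print 'same backslash', "\n"; }
    print length($empty), "\n";  # 0
    print length($mixed), "\n";  # 3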

BEST PRACTICES

  • Use q-quoted text literals to represent empty text literals only.
    ''       # NOT BEST PRACTICE: single-quotes invalid, empty string
    ""       # NOT BEST PRACTICE: double-quotes invalid, empty string
    q{}      #     BEST PRACTICE

    '0gnb'   #     BEST PRACTICE
    "0gnb"   # NOT BEST PRACTICE: double-quotes not needed
    q{0gnb}  # NOT BEST PRACTICE:      q-quotes not needed

    '\\nx'   #     BEST PRACTICE
    "\\nx"   # NOT BEST PRACTICE: double-quotes    invalid, extra backslash
    q{\\nx}  # NOT BEST PRACTICE:      q-quotes not needed

Section 2.2.6: Editing Operators

Name         Symbol  Arity     Fixity  Precedence  Associativity  Supported
Substring    substr  Variadic  Prefix  01          Left           Coming Soon
Repeat       x       Binary    Infix   07          Left           Coming Soon
Concatenate  .       Binary    Infix   08          Left           Yes
Length       length  Unary     Prefix  10          Non            Coming Soon
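
The following is a minimal sketch of the editing operators in normal Perl 5; only the concatenate operator is marked as supported by RPerl in the table above, so the rest should be read as normal Perl only. The variable names and example text are chosen only for illustration.

    use strict;
    use warnings;

    my $text = 'Roadrunner';

    my $road   = substr $text, 0, 4;     # 'Road':   4 characters starting at offset 0
    my $runner = substr $text, 4;        # 'runner': everything from offset 4 onward
    my $beeps  = 'beep ' x 2;            # 'beep beep '
    my $joined = $road . '-' . $runner;  # 'Road-runner'
    my $count  = length $text;           # 10

    print $joined, ' ', $beeps, $count, "\n";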

Section 2.2.7: Case Operators

Name                       Symbol   Arity  Fixity  Precedence  Associativity  Supported
Lowercase                  lc       Unary  Prefix  10          Non            Coming Soon
Lowercase First Character  lcfirst  Unary  Prefix  10          Non            Coming Soon
Uppercase                  uc       Unary  Prefix  10          Non            Coming Soon
Uppercase First Character  ucfirst  Unary  Prefix  10          Non            Coming Soon
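
The following is a minimal sketch of the case operators in normal Perl 5 only, because none of them are yet supported by RPerl. The variable names and example text are chosen only for illustration.

    use strict;
    use warnings;

    my $text = 'hello WORLD';

    my $lower       = lc $text;          # 'hello world'
    my $upper       = uc $text;          # 'HELLO WORLD'
    my $lower_first = lcfirst 'HELLO';   # 'hELLO'
    my $upper_first = ucfirst $text;     # 'Hello WORLD'

    print $lower, ' ', $upper, ' ', $lower_first, ' ', $upper_first, "\n";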

Section 2.2.8: Comparison (Relational & Equality) Operators

Name                   Symbol  Arity   Fixity  Precedence  Associativity  Supported
Less-Than              lt      Binary  Infix   11          Non            Coming Soon
Greater-Than           gt      Binary  Infix   11          Non            Coming Soon
Less-Than-Or-Equal     le      Binary  Infix   11          Non            Coming Soon
Greater-Than-Or-Equal  ge      Binary  Infix   11          Non            Coming Soon
Equal                  eq      Binary  Infix   12          Non            Coming Soon
Not Equal              ne      Binary  Infix   12          Non            Coming Soon
Three-Way Comparison   cmp     Binary  Infix   12          Non            Coming Soon
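
The following is a minimal sketch of the text comparison operators in normal Perl 5 only, because none of them are yet supported by RPerl. The variable names and example text are chosen only for illustration; note the third line, where text comparison gives a different answer than numeric comparison would.

    use strict;
    use warnings;

    my $is_less    = ( 'apple' lt 'banana' ) ? 1 : 0;  # 1: letter a sorts before letter b
    my $is_equal   = ( 'pear'  eq 'pear'   ) ? 1 : 0;  # 1
    my $text_order = ( '10'    lt '9'      ) ? 1 : 0;  # 1: compared character by character, not as numbers

    my $before = 'apple' cmp 'banana';  # -1
    my $same   = 'pear'  cmp 'pear';    #  0
    my $after  = 'pear'  cmp 'apple';   #  1

    # the three-way comparison operator is the usual way to sort text
    my @sorted = sort { $a cmp $b } ( 'pear', 'apple', 'banana' );  # apple, banana, pear

    print join( ', ', @sorted ), "\n";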

Section 2.2.9: Search Operators

Name           Symbol  Arity     Fixity  Precedence  Associativity  Supported
Index          index   Variadic  Prefix  01          Left           Coming Soon
Reverse Index  rindex  Variadic  Prefix  01          Left           Coming Soon
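
The following is a minimal sketch of the search operators in normal Perl 5 only, because neither is yet supported by RPerl. Offsets are counted from 0, and a result of -1 means the text was not found; the variable names and example text are chosen only for illustration.

    use strict;
    use warnings;

    my $text = 'banana';

    my $first   = index $text, 'an';      #  1: first match, searching from the left
    my $second  = index $text, 'an', 2;   #  3: first match at or after offset 2
    my $last    = rindex $text, 'an';     #  3: last  match, searching from the right
    my $missing = index $text, 'xy';      # -1: no match

    print $first, ' ', $second, ' ', $last, ' ', $missing, "\n";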

Section 2.2.10: Formatting Operators

Name                    Symbol     Arity     Fixity  Precedence  Associativity  Supported
String Formatted Print  sprintf    Variadic  Prefix  01          Left           Coming Soon
Quote Meta              quotemeta  Unary     Prefix  10          Non            Coming Soon
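
The following is a minimal sketch of the formatting operators in normal Perl 5 only, because neither is yet supported by RPerl. The format strings and variable names are chosen only for illustration.

    use strict;
    use warnings;

    my $padded_number = sprintf '%05.1f', 3.14159;  # '003.1':   width 5, zero-padded, 1 decimal place
    my $padded_text   = sprintf '%-6s|', 'ab';      # 'ab    |': left-justified in width 6
    my $hexadecimal   = sprintf '%x', 255;          # 'ff'

    # quotemeta places a backslash before each non-word character
    my $quoted = quotemeta 'a.b*c';                 # a\.b\*c

    print $padded_number, ' ', $padded_text, ' ', $hexadecimal, ' ', $quoted, "\n";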

Section 2.2.11: Base Conversion Operators

Name  Symbol  Arity  Fixity  Precedence  Associativity  Supported
Chr   chr     Unary  Prefix  01          Left           Coming Soon
Hex   hex     Unary  Prefix  10          Non            Coming Soon
Oct   oct     Unary  Prefix  10          Non            Coming Soon
Ord   ord     Unary  Prefix  10          Non            Coming Soon
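
The following is a minimal sketch of the base conversion operators in normal Perl 5 only, because none of them are yet supported by RPerl. The input values and variable names are chosen only for illustration.

    use strict;
    use warnings;

    my $letter      = chr 65;        # 'A': character with code 65
    my $code        = ord 'A';       # 65:  code of the first character
    my $from_hex    = hex 'ff';      # 255
    my $from_octal  = oct '755';     # 493
    my $from_prefix = oct '0x1f';    # 31:  oct also understands the 0x and 0b prefixes
    my $from_binary = oct '0b101';   # 5

    print $letter, ' ', $code, ' ', $from_hex, ' ', $from_octal, ' ', $from_prefix, ' ', $from_binary, "\n";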

Section 2.2.12: Miscellaneous Operators

Name   Symbol  Arity   Fixity  Precedence  Associativity  Supported
Crypt  crypt   Binary  Prefix  01          Left           Coming Soon

Section 2.3: RPerl's Phases, Warnings & Errors

We will now take a break from numbers and strings and operators, in order to investigate the various stages of installing and running RPerl, as well as the most common issues you may encounter.

Like normal Perl, RPerl tries to generate helpful messages when something does not go as planned.

A "warning" occurs when something unexpected happens, but RPerl can continue on without being forced to end prematurely.

An "error" occurs when something unexpected happens, and RPerl must end immediately.

We will collectively refer to both warnings and errors as "problems", and we will refer to both warning messages and error messages as "problem messages".

There are 15 RPerl "phases", most of which are (hopefully unsurprisingly) termed "compile phases". An RPerl phase is simply a distinct category of work to be done by RPerl or some other related software, and a compile phase is a category of work specifically related to compiling RPerl application source code.

All warnings and errors in RPerl fall within exactly 1 phase; technically, only those phases shown in bold contain problem messages generated by RPerl itself:

Section 2.3.1: Install, Base Dependencies

The first step toward using RPerl is installing RPerl, and the first step of installing RPerl is making sure you have a working copy of the Perl interpreter and a C++ compiler, as well as a few other base dependencies such as libc, libperl, ExtUtils::MakeMaker, GMP, cURL, and AStyle.

A "dependency" is one piece of software which another piece of software requires for proper functionality. One software component is thus said to "depend" upon a second software component, without which the first component would not work correctly. A "subdependency" is a dependency of a dependency. Each piece of software may require any number of dependencies, each of which may require any number of subdependencies, and so forth until all required software components have been loaded. For the purposes of the discussions in this book, the term "dependency" will inclusively refer to all of a specific software component's dependencies, subdependencies, sub-subdependencies, etc.

A "base dependency" is a software component required for RPerl to function, and is differentiated from a "CPAN dependency" because base dependencies are installed manually or via the operating system package manager instead of via the CPAN Perl software network, as detailed in the next section. If you have a modern operating system, you may already have most-or-all of these base dependencies either pre-installed or available for easy automatic install via pre-built packages; if not, you may need to manually install one or more of them. For more information on this phase of installation, please see:

"Section 1.16: How Can I Download & Install RPerl?"

Unique challenges may be posed by every different combination of operating system and hardware platform, because there are many different versions (and even variants) of each base dependency, with no way to predict your computer's behavior ahead of time. Also, the RPerl development team has little-to-no interaction with the developers of the base dependencies, so it is impossible to provide useful documentation of the countless possible problem messages which you may encounter in this phase.

Some common types of problem messages may include:

If you encounter any problems during this step, please consult your operating system's documentation or see:

"Section 1.18: How Can I Obtain Technical Support For RPerl?"

Section 2.3.2: Install, CPAN Dependencies

Perhaps the most complex part of installing RPerl is comprised of finding, building, testing, and installing all of the software needed for RPerl's dependencies (and subdependencies) via CPAN. A "CPAN dependency" is an RPerl dependency which is installed via the CPAN software network. Future versions of RPerl will provide easier pre-built installation packages for popular operating systems, such as those used in the base dependencies phase, but for now all (sub)dependencies must be downloaded from CPAN and custom-built for your operating system. Each individual piece of software downloaded from CPAN is called a "distribution", so RPerl itself is a CPAN distribution, and each of RPerl's CPAN (sub)dependencies is also a CPAN distribution.

We have tried to minimize the number of dependencies upon which RPerl relies, but even 1 dependency can itself have dozens or hundreds of subdependencies. RPerl is a relatively complex piece of software, with a moderate number of dependencies and (by association) a high number of subdependencies. If there is an error with even 1 of the numerous (sub)dependencies, it will either cause the RPerl installation to immediately terminate, or the installation will attempt to proceed in a partially-broken state, which will likely result in further errors when the broken software components are automatically tested.

CPAN promotes the use of a minimum version requirement for each (sub)dependency, which means RPerl will not use out-of-date software from CPAN, thereby avoiding a large number of known errors. However, there is much less emphasis on utilizing a maximum version requirement, so CPAN will always default to installing the latest public release of each (sub)dependency. It is the responsibility of the author of each (sub)dependency to ensure all public releases are stable, secure, and bug-free across a wide range of operating systems and hardware platforms, which is a constantly moving target for each software developer because new operating systems and hardware platforms are released all the time. This means the latest public release of any 1 specific CPAN distribution may remain stable for a long time, and then all-of-a-sudden it may become unstable when a new version of one of its own (sub)dependencies is released containing a bug or incompatibility. This further means the latest public release of the RPerl distribution on CPAN may work fine on your computer, then stop working properly if you update either the base dependencies or the CPAN dependencies.

There are 2 primary front-end applications used to install dependencies from CPAN, namely the `cpan` and `cpanm` commands, each of which has its own style of problem messages. In addition, there are unique problem messages which may be emitted by each individual CPAN distribution. As with the base dependencies in the previous RPerl install phase, there are countless possible problem messages you may encounter in this phase, none of which would be generated by RPerl itself.

Common problem messages in this phase may include:

Getting RPerl to work correctly does not simply require hitting a moving target, it requires hitting dozens of moving targets simultaneously. Most of the authors of CPAN distributions are volunteers, and software bugs happen even when the developers are well-paid, so don't be surprised if you encounter a problem during this phase. If you are convinced that you have found an as-yet undiscovered problem, please see:

"Section 1.18: How Can I Obtain Technical Support For RPerl?"

and

"Section 1.20: Are There Any Bugs In RPerl?"

The RPerl development team is in communication with the developers of several of the primary CPAN dependencies, so if you find a bug we may be able to help find the right people to fix it.

Section 2.3.3: Install, Test Suite

The final step of installing RPerl is to pass all tests in the RPerl automatic test suite, which can only occur after all base and CPAN dependencies are properly installed. The RPerl test suite includes a few thousand individual test cases, covering all parts of the RPerl compiler. If any of the RPerl tests fail or generate a problem message, then there may be a problem with RPerl itself, and we would very much appreciate your help by submitting a bug report:

"Section 1.20: Are There Any Bugs In RPerl?"

Common problem messages in this phase may include:

Section 2.3.4: Initialize, Bootstrap Subcompile

Each time the RPerl compiler is called, it may utilize a number of its own internal "helper function" components which are written using C++, so those components must first be compiled using a C++ compiler and linked to Perl 5 using Inline::CPP, during the "bootstrap subcompile" phase. In common English usage, "bootstrap" refers to "pull yourself up by your bootstraps", which is the seemingly-impossible act of lifting one's self off the ground by pulling upward on one's shoes, and which means the ability to create something from nothing or to work some force upon one's own self. In compiler terminology, bootstrap means a compiler which can partially or fully compile itself, and is used as a measure of a compiler's long-term progress because only the most mature compilers can bootstrap. In this specific use of the term as part of the bootstrap subcompile phase, the term bootstrap means the need to use Inline::CPP to compile a small part of the RPerl compiler, which is then used to further compile other RPerl system and application code.

RPerl is a compiler, which itself utilizes 1 or more previously-existing compilers, the use of which we will call "subcompile" actions. The primary subcompile actions utilize the C++ compiler, because both "Initialize, Bootstrap Subcompile" and "Compile, Subcompile (Generate Binary)" are C++ subcompile phases. Only RPerl system developers will be exposed to the additional non-C++ subcompile actions.

Common problem messages in this phase may include:

Section 2.3.5: Initialize, Configuration

The second half of RPerl initialization involves the automatic configuration of multiple internal software settings. This phase is primarily comprised of setting the file and directory paths to important components of your operating system and of RPerl itself.

Common problem messages in this phase may include:

Section 2.3.6: Compile, Arguments & Files

RPerl is a compiler, so it is unsurprising that 9 of the 15 RPerl phases are compile phases. When a software developer (like you!) runs the RPerl compiler via the `rperl` command, the first compile phase checks the validity of the command-line arguments (AKA options) as well as the input source code file(s) to be compiled. Command-line arguments can be entirely omitted, in which case default behavior will be utilized. At least one input file must be specified, unless you are running `rperl -?` or `rperl -v` instead of a full RPerl compile command.

For details about all of RPerl's command-line arguments, please see Appendix B:

"APPENDIX B: RPERL COMMAND-LINE ARGUMENTS"

Common problem messages in this phase may include:

Section 2.3.7: Compile, Dependencies

Not to be confused with base dependencies in section 2.3.1 and CPAN dependencies in section 2.3.2, a normal RPerl "dependency" is one piece of RPerl application software which another piece of RPerl application software requires for proper functionality. Every RPerl application may have 0, 1, or any number of dependencies. As in normal Perl, an RPerl dependency is specified in source code via the use Some::Dependency; operator.

If your RPerl application has 1 or more dependencies, then RPerl must compile each dependency before it compiles your main RPerl application. If any of the dependencies rely upon subdependencies (and so forth), then each subdependency must be recursively checked for its own dependencies and they must all be compiled as well. In other words, before your main RPerl application can be compiled, RPerl must first compile all dependencies, subdependencies, etc. If the compilation of dependencies is disabled via the --nodependencies or --mode dependencies=OFF command-line arguments, or if there is a problem encountered while attempting to compile one of the dependencies, then compilation of your main RPerl program will also almost certainly experience a problem or total failure.

During this phase of compilation, RPerl makes a list of all dependencies which need to be compiled; however, the actual compiling of each dependency's source code input file does not occur until the following compile phases. This phase only produces an assessment of what files need to be compiled, and in what order.
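
The idea of listing every (sub)dependency before the code which requires it can be sketched as a depth-first walk. The following is a minimal illustration in normal Perl 5 under stated assumptions, and is not RPerl's actual implementation: the package names, the %dependencies_of table, and the list_dependencies subroutine are all hypothetical.

    use strict;
    use warnings;

    # HYPOTHETICAL dependency table: each package maps to the packages it loads
    my %dependencies_of = (
        'MyApp'         => [ 'MyApp::Shapes', 'MyApp::Math' ],
        'MyApp::Shapes' => ['MyApp::Math'],
        'MyApp::Math'   => [],
    );

    my @compile_order = ();
    my %seen          = ();

    # depth-first walk: list every (sub)dependency before the package which requires it
    sub list_dependencies {
        my ($package) = @_;
        return if $seen{$package};    # already listed, also guards against circular dependencies
        $seen{$package} = 1;
        foreach my $dependency ( @{ $dependencies_of{$package} } ) {
            list_dependencies($dependency);
        }
        push @compile_order, $package;
        return;
    }

    list_dependencies('MyApp');
    print join( ', ', @compile_order ), "\n";  # MyApp::Math, MyApp::Shapes, MyApp

The main application appears last in the list, and the shared subdependency appears only once even though 2 different packages require it.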

Common problem messages in this phase may include:

Section 2.3.8: Compile, Parse Phase 0 (Check Perl Syntax)

This is the first phase where the input source code files are thoroughly checked for lexical and syntax errors. The term "parse" means to read through some structured data input such as human-readable plain-text source code, and a "full parse" results in the computer storing some "intermediate representation" (IR) which is a more computer-friendly encoding of the original source code. All 3 parse phases are full parses, although only the final intermediate representation is utilized by RPerl.

The normal Perl interpreter is called in "check syntax only" mode with all "strictures" enabled, as well as all warnings enabled and set to "fatal", via the `perl -M"warnings FATAL=q(all)" -Mstrict -cw` command. Strictures are additional built-in rules which normal Perl can optionally apply while parsing a source code input file, and a fatal warning is essentially upgraded to become the same thing as an error because execution will terminate instead of continuing. All RPerl source code files are required to have strictures and (non-fatal) warnings enabled via the use strict; use warnings; statements, but RPerl won't check for the inclusion of those statements until the following parse phase 1 (criticize Perl syntax); thus, we make sure both strictures and warnings are automatically enabled in this step, in order to catch normal Perl errors sooner rather than later.
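
The difference between a normal warning and a fatal warning can be seen in a few lines of normal Perl. The following is a minimal sketch in normal Perl 5 only, which enables fatal warnings from inside the source code instead of from the command line; the variable names are chosen only for illustration.

    use strict;
    use warnings;

    my $undefined;

    # non-fatal warning: the message is reported and execution continues
    my $warning_message = q{};
    {
        local $SIG{__WARN__} = sub { $warning_message = $_[0]; };
        my $text = 'x' . $undefined;    # warns about an uninitialized value
    }

    # fatal warning: the same problem is now raised as an error, which ends the eval block early
    my $survived = eval {
        use warnings FATAL => 'all';
        my $text = 'x' . $undefined;
        1;
    };
    my $error_message = $@;

    print 'warning: ', $warning_message;
    print 'error:   ', $error_message;
    print 'eval block completed? ', ( $survived ? 'yes' : 'no' ), "\n";  # no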

All problems encountered in this step are triggered by normal Perl, not by RPerl itself, and it is impractical to attempt to embed all pertinent Perl documentation within this RPerl documentation. Please utilize the following resources (in this order) to help reach a solution:

Common problem messages in this phase may include:

Section 2.3.9: Compile, Parse Phase 1 (Criticize Perl Syntax)

This phase of compilation checks the input source code files to make sure they all follow Perl "best practices" as originated in Damian Conway's book Perl Best Practices and implemented in Jeffrey Thalhammer's software Perl::Critic. A best practice is a rule or strategy which is commonly accepted by a professional community as being the most reliable and successful technique for approaching some specific task.

In this case, the professional community is our Perl computer programming community, and the Perl best practices themselves have been reduced to practice (get it?) by automatic source code checking via Perl::Critic and the PPI Perl pseudo-parser. In this phase of parsing, Perl::Critic is configured to operate in the most restrictive mode possible, which is called severity level 1, AKA "brutal" mode, and enables all possible Perl::Critic "policies" (rules). Although RPerl purposefully utilizes brutal mode, there are some Perl::Critic policies which either conflict with other policies (yes that's bad) or with RPerl itself. To compensate for this, RPerl application developers must selectively disable each problematic policy one-at-a-time using ## no critic statements, which are explained throughout this book where applicable. For a complete list of all accepted no-critic statements, please see Appendix C:

"APPENDIX C: RPERL CRITICS"

In addition, RPerl-specific best practices are included throughout this book where explicitly indicated, although the RPerl best practices are not currently checked or enforced in any way, and you should be careful not to confuse Perl best practices with RPerl best practices. It is assumed the RPerl application developers already have enough flaming hoops to jump through, in the form of the 3 existing RPerl parse phases, so RPerl best practices are not required, only suggested.

You may wish to purchase or download Damian Conway's book from our friends at O'Reilly Media, especially because its page numbers are referenced in this phase's error messages:

http://shop.oreilly.com/product/9780596001735.do

If you can't afford to purchase the book at this time, then you can still review a free list including all of Mr. Conway's Perl best practices in Johan Vromans' reference guide, although currently we have not yet made a mapping from policy numbers to the corresponding book page numbers:

http://rperl.org/docs/PBP_refguide-1.02.00.pdf

You will also likely want to refer to the list of Perl::Critic policies and the explanations thereof:

http://search.cpan.org/~thaljef/Perl-Critic/lib/Perl/Critic/PolicySummary.pod

Common problem messages in this phase may include:

Section 2.3.10: Compile, Parse Phase 2 (Parse RPerl Syntax)

The final parse phase of compilation utilizes RPerl's actual grammar to check the input source code files for valid RPerl syntax. The RPerl grammar implements a restricted subset of the normal Perl 5 language. This phase follows naturally from the previous parse phases: RPerl syntax (parse phase 2) is more restrictive than Perl::Critic brutal severity (parse phase 1), which is in turn more restrictive than normal Perl with fatal warnings and strictures enabled (parse phase 0), which is again more restrictive than normal lax Perl (not needed for RPerl parsing).
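
The difference between lax Perl and the stricter levels in this chain can be seen in a small plain-Perl sketch. It assumes that "fatal warnings and strictures" can be approximated with the standard use strict; and use warnings FATAL => 'all'; pragmas; it is not RPerl's actual phase 0 invocation, only an illustration of the idea.

```perl
use strict;
use warnings FATAL => 'all';

# With fatal warnings enabled, a mere warning becomes an exception.
my $survived = eval {
    my $undefined;
    my $sum = $undefined + 1;    # 'uninitialized value' warning, now fatal
    1;
};
my $error = $@;

print 'addition with an undefined value ',
    ( $survived ? 'only warned' : 'died' ), "\n";
print 'message: ', $error;
```

Under lax Perl the same addition would quietly evaluate to 1; each stricter level rejects more of the code that the level below it accepts.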

Some error messages may also include a "Helpful Hint" indicating a possible cause or solution; computer language parsing can be a very complicated process and there is no guarantee the hint is correct, so each hint should be taken with a grain of salt (a dose of healthy skepticism).

Full documentation of the RPerl grammar is provided in Appendix D:

"APPENDIX D: RPERL GRAMMAR"

Technical support for commercial RPerl users is provided by APTech, as detailed here:

"Section 1.18: How Can I Obtain Technical Support For RPerl?"

Free technical support for non-commercial users is provided by the RPerl community, as detailed here:

"Section 1.19: Are There Any Free Technical Support Options?"

Common problem messages in this phase may include:

Section 2.3.11: Compile, Generate (C++ Syntax)

After all parse phases of compilation have concluded, the original human-readable plain-text input source code files have been converted into an intermediate representation (IR) format known as an "abstract syntax tree" (AST), which exists in the computer's random-access memory (RAM or just "memory"), and which is more easily analyzed or modified by a computer.

Plain-text source code formats can be readily understood by trained humans, but are nearly impossible for computers to directly understand; compilers like RPerl exist for the sole purpose of converting source code into format(s) more easily understood by a computer. Intermediate representation formats such as RPerl's AST or normal Perl's "operation tree" (AKA "optree") or other languages' "bytecode" can be understood by humans and computers alike, but with a significant amount of difficulty for both. Binary code formats such as compiled executables or libraries can be readily understood by properly-configured computers, but are nearly impossible for humans to directly understand.

(The process of "decompiling" or "reverse engineering" the original human-readable source code from an optimized binary format is comparable to unscrambling an egg; it is currently considered nearly impossible to fully achieve, thus the act of compilation is considered to be a one-way process and a strong protection against other people viewing your original source code or copying your intellectual property. You should be careful not to confuse decompiling with uncompiling: decompiling accepts as input a compiled binary file and then produces as output human-readable source code, while uncompiling accepts as input human-readable source code and then takes the action of deleting all compiled binary files which were produced from the source code during previous compilations using the same file names. Decompiling is not widely considered to be possible, while uncompiling is one of the more frequently utilized RPerl command-line arguments.)

During this phase of compilation, RPerl converts the AST format back into human-readable source code, but in the C++ language syntax (by default) instead of the RPerl language syntax. This is called the "generate" phase and is, in general, the exact opposite of a full parse phase; parsing converts from source code to IR, while generating converts from IR to source code. When configured to operate in test mode, RPerl does not generate C++ source code during this phase, but instead generates back into the exact same RPerl source code as the original input file. Test mode is thus able to check the generated RPerl source code to make sure it matches the original source code line-by-line, and also thereby "test" the RPerl parser's internal system code and "test" for the syntax correctness of the original source code, without the complexity of new C++ code becoming involved.
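
The parse-then-generate round trip described above can be sketched with a toy example in plain Perl. The subroutines toy_parse() and toy_generate() are hypothetical names invented for this illustration, and the tiny hash-based tree is only a stand-in for a real AST; none of this is RPerl's actual parser or generator.

```perl
use strict;
use warnings;

# Toy "parse": convert source text of the form 'LEFT OP RIGHT;' into a tiny tree.
sub toy_parse {
    my ($source) = @_;
    my ( $left, $operator, $right )
        = $source =~ m{\A \s* (\w+) \s* ([-+*/]) \s* (\w+) \s* ; \s* \z}xms
        or die "toy_parse: unsupported source: $source\n";
    return { operator => $operator, left => $left, right => $right };
}

# Toy "generate": convert the tree back into source text.
sub toy_generate {
    my ($tree) = @_;
    return join q{ }, $tree->{left}, $tree->{operator}, $tree->{right} . q{;};
}

my $original    = 'a + b;';
my $tree        = toy_parse($original);
my $regenerated = toy_generate($tree);

print 'original:    ', $original,    "\n";
print 'regenerated: ', $regenerated, "\n";
print 'round trip ', ( $regenerated eq $original ? 'matches' : 'differs' ), "\n";
```

Comparing the regenerated text against the original is the same kind of check that test mode is described as performing, only on a vastly smaller scale.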

When compared to normal Perl, the 2 primary benefits of the RPerl compiler come from generating C++ source code: high-performance runtime speed optimization, and intellectual property source code protection. The act of converting from one human-readable computer language source code into another is comparable to the similar act of converting from one spoken (or written) human language to another; this is called the "translate" process. When computer language translation is utilized as part of a compiler, the software is called a "transpiler" or "translating compiler". RPerl is a transpiler.

Section 2.3.12: Compile, Save Phase 0 (Final File Modifications)

During this phase, a number of changes and updates may be made to the output files. Most of these changes were delayed from previous phases, due to as-yet-incomplete compile-time information. This is the final phase during which RPerl directly generates output source code.

Section 2.3.13: Compile, Save Phase 1 (Format & Write Files To Disk)

This phase may utilize source code formatting software, including Perl::Tidy for Perl output code as well as the Artistic Style formatter for C++ output code. If an error occurs with one of these third-party code formatters, please review the applicable documentation.

After the output files have been formatted, they are saved to your computer's storage, which usually means the output files are written to hard-disk or the equivalent. Please be sure you have free disk space and the proper disk permissions to save files, etc.

Section 2.3.14: Compile, Subcompile (Generate Binary)

The final compile phase utilizes your pre-installed C++ compiler software, thus it is termed the subcompile phase. Please be sure you have a subcompiler installed which supports the C++11 standard, such as GNU GCC v4.7 or Clang v3.3, among others. If an error occurs during this phase, please refer to the applicable subcompiler documentation.

Section 2.3.15: Execute

If RPerl has been provided with exactly one file as input, and if that file is an executable Perl program file ending with .pl, then RPerl may optionally execute the program after all previous phases have been completed.

If RPerl is utilized in test mode, then the original non-compiled Perl program input file will be executed via the Perl interpreter with RPerl extensions. If RPerl is utilized in normal compile mode, then the compiled Perl program output file will be executed without the Perl interpreter.

Section 2.4: Variables With Scalar Values

An RPerl "expression" is any general-purpose language component which either returns a value or is a literal value itself.

An RPerl "statement" is any general-purpose language component which performs some action(s).

An RPerl "named operator" is any of the 220+ Perl named operators, although RPerl only supports the low-magic forms of each operator.

An RPerl "operation" is the equivalent of a single sentence in human language, and may be either an expression followed by a ; semicolon punctuation character, or a named operator followed by a semicolon, or a statement.

The equal-sign = is the assignment operator, used to set the variable on its left to store the value of the expression on its right.

Perl's my keyword is used to declare a new variable, and optionally initialize it to a starting value when combined with the = assignment operator.

Normal Perl does not support specific data types, so in Perl one variable named $foo may be initialized with a numeric value, then the same $foo variable may be changed to hold a string value without any warning or error.

    my $foo = 23;
    $foo = 'twenty-three';  # just fine in normal Perl

On the other hand, RPerl requires the use of data types for each and every variable.

    my $foo = 23;  # error in RPerl, all modes

    my number $foo = 23;
    $foo = 'twenty-three';  # error in RPerl, compiled (non-test) modes, assigning string literal to number variable

    my number $foo = 23;
    $foo = 42;  # just fine in RPerl

Normal Perl provides a special literal value undef, along with a built-in operator named defined which can be used to test if a variable contains the special undef value. RPerl does not support the undef value, nor use of the defined operator with scalar variables, because C++ does not fully support the concept of undefined values, and RPerl's high-speed components are written using C++.
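
For comparison, here is how undef and defined behave in normal Perl. This sketch is plain Perl only; as stated above, it is not supported for scalar variables in RPerl.

```perl
use strict;
use warnings;

my $foo;    # declared but never assigned, so normal Perl stores undef

my $before = defined $foo ? 'defined' : 'undefined';
$foo = 23;
my $after = defined $foo ? 'defined' : 'undefined';

print 'before assignment: ', $before, "\n";
print 'after assignment:  ', $after,  "\n";
```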

Future versions of RPerl will likely provide a flexible scalar data type, which will offer the trade-off of higher memory usage in exchange for the ability to use one variable with multiple possible data types. Currently, all RPerl variables must hold exactly one specific data type only.

Data types make your code much more readable and much, much faster. Learn to love data types. Now. :-)

Section 2.4.1: How To Select Expressive Variable Names

Your choice of variable names will have a significant impact on the readability of your source code, whether written in Perl or any other programming language. If the names of your variables make sense, then it will be much easier for other programmers to understand your code. Also, good variable names will help you to understand your own code, after not looking at it for a few months or years. Variable names may include letters, numbers, and underscores.

BEST PRACTICES

  • Variable names should be comprised of lowercase letters, numerals, and underscore characters only.
  • Variable names containing multiple words should use underscores to separate each word.

You may not choose to utilize an all-uppercase variable name, because RPerl reserves that format for the names of constants; please see "Section 2.5: Constant Data" for more information. If you really insist, then RPerl will allow you to use mostly-uppercase variable names, as long as at least one lowercase letter is included. Also, instead of underscores you may choose to utilize "CamelCase" in order to separate words in your variable names. However, neither of these variable naming practices is recommended.

In addition to the best practices listed above, a good variable name should also fulfill all 3 of the following:

BEST PRACTICES

  • Meaningful: Actually describe what data the variable will hold.
  • Medium Length: Not too long, and definitely not too short.
  • Simple: Two or three common words separated by underscores is better than one obscure word.

The first of these three best practices is for your variable names to be meaningful, so you should never actually choose $foo or $bar as your variable names, unless you are writing a programming textbook and need to make up a bunch of small stand-alone code examples. When first choosing a variable's name, take a moment and simply think of the most descriptive word or words which explain what it is you plan to use the variable for. Don't worry if your variable name starts out a bit too long, you can always shorten it later if appropriate.

The second best practice is for your variable names to be of medium length, so you should usually not choose either $r (too short) or $rounding_error_from_perl_arithmetic_operators (too long); instead, how about $rounding_error (just right)? To be fair, there are a number of commonly-accepted single-letter variable names including $i and $j and $k for loop iterators (see "Section 2.9.1: Loop Iterator Variables" for more information), as well as traditional mathematics variables such as $n (a "number" of items) and $x and $y and $z (Cartesian coordinates).

The third and final best practice is for your variable names to be simple, so you should usually choose $golden_ratio instead of $phi because more people will understand what the "golden ratio" is, and also the Greek letter "phi" is ambiguous as it is traditionally used for multiple different mathematical purposes. On the other hand, most people will understand what you mean if you give a numeric variable the name $circumference, so you probably do not need to utilize the longer name $perimeter_length instead.

Try your best to choose good variable names; you will thank yourself for it later.

Section 2.4.2: Boolean Data Type

The most efficient data type is boolean, which is a numeric type which stores a single "bit" (binary digit) of information. A boolean may only hold the values of exactly 0 or 1.

    my boolean $foo = 0;     # fine
    my boolean $bar = 1;     # fine
    my boolean $baz = -1.5;  # error in RPerl, compiled (non-test) modes

Section 2.4.3: Unsigned Integer Data Type

The second most efficient numeric data type is unsigned_integer, which stores a single whole (non-decimal) number which must have a value of 0 or greater. An unsigned_integer may not hold a negative number, and must fit within the data size limits of the data types supported by your operating system software and computer hardware.

    my unsigned_integer $foo  = -23;     # error in RPerl, compiled (non-test) modes
    my unsigned_integer $bar  = 0;       # fine
    my unsigned_integer $baz  = 42_230;  # fine
    my unsigned_integer $bax  = 42.1;    # error in RPerl, compiled (non-test) modes
    my unsigned_integer $quux = 999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999;  # likely error or data corruption, outside data type limits

Section 2.4.4: Integer Data Type

The third most efficient data type is integer, which stores a single whole (non-decimal) number. An integer may hold any positive or negative whole number, within your computer's data type limits.

    my integer $foo  = -23;     # fine
    my integer $bar  = 0;       # fine
    my integer $baz  = 42_230;  # fine
    my integer $bax  = 42.1;    # error in RPerl, compiled (non-test) modes
    my integer $quux = -999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999_999;  # likely error or data corruption, outside data type limits
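
The "data type limits" mentioned for both integer types can be observed in normal Perl, which quietly falls back to floating-point storage when a whole number is too large for a native integer. The sketch below is plain Perl, not RPerl; the exact limit shown depends on how your Perl interpreter was built, so no specific value is assumed.

```perl
use strict;
use warnings;

# All bits set: the largest native unsigned integer on this Perl build.
my $max_unsigned = ~0;

# Far larger than any native integer, so Perl stores it as a floating-point value.
my $huge = 999_999_999_999_999_999_999_999_999_999;

# At this magnitude, adding 1 is lost to floating-point rounding.
my $lost_precision = ( $huge + 1 == $huge ) ? 1 : 0;

print 'largest native unsigned integer: ', $max_unsigned, "\n";
print 'adding 1 to the huge value ',
    ( $lost_precision ? 'changed nothing' : 'changed it' ), "\n";
```

This silent loss of precision is one form of the "data corruption" warned about in the code comments above.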

Section 2.4.5: GMP Integer Data Type

The GNU Multi-Precision (GMP) library provides an arbitrary-precision integer data type which is accessible in RPerl as the gmp_integer type, and which does not need to fit within your computer's data type limits. All RPerl source code files which make use of GMP features must include the use rperlgmp; command. This tells RPerl to load the GMP library and other supporting software components, which are not normally loaded by the RPerl compiler.

The GMP integer data type works differently than the other normal data types, because a gmp_integer can be considered a combination of a normal integer data type and a special software component known as an "object". (Please see "CHAPTER 11: CLASSES, PACKAGES, MODULES, LIBRARIES" for more information on objects.)

When we declare a new variable of the gmp_integer data type, there is a 2-part initialization procedure, as opposed to a normal variable which can be declared and initialized as part of a single RPerl statement. The first initialization statement is to declare our new gmp_integer variable and pre-initialize it with a special subroutine called a "constructor", which is achieved by calling gmp_integer->new() in your RPerl source code. The second initialization statement completes the procedure by calling one of the three following GMP initialization subroutines:

    gmp_init($new_gmp_var);
    gmp_init_set_unsigned_integer( $new_gmp_var, POSITIVE_INTEGER_VALUE );
    gmp_init_set_signed_integer( $new_gmp_var, INTEGER_VALUE );

The gmp_init() subroutine sets the variable's value to a default of 0. The gmp_init_set_unsigned_integer() and gmp_init_set_signed_integer() subroutines set the variable's value to the provided numeric literal value. When you utilize the three example subroutine calls above, you should replace $new_gmp_var with the name of your own gmp_integer variable which has already been pre-initialized via gmp_integer->new(), and also replace POSITIVE_INTEGER_VALUE or INTEGER_VALUE with an actual numeric literal value.

In the source code example below, we create three new gmp_integer variables which are named $foo, $bar, and $baz, and which are initialized to the values of 0, 23, and -23 respectively.

    use rperlgmp;

    my gmp_integer $foo  = gmp_integer->new();
    my gmp_integer $bar  = gmp_integer->new();
    my gmp_integer $baz  = gmp_integer->new();

    gmp_init($foo);
    gmp_init_set_unsigned_integer( $bar, 23 );
    gmp_init_set_signed_integer( $baz, -23 );

Please see "Section 2.4.14: GMP Integer Operators" for more information.
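
To get a feel for what arbitrary precision means, the following sketch uses Math::BigInt, a module which ships with normal Perl. It is only an analogy chosen for illustration: Math::BigInt is not RPerl's gmp_integer type and has a different calling convention.

```perl
use strict;
use warnings;
use Math::BigInt;

# A 60-digit whole number, far beyond any native integer limit.
my $big = Math::BigInt->new( '9' x 60 );

# copy() protects $big, because binc() increments its object in place.
my $next = $big->copy()->binc();

print 'digits in $big:  ', length( $big->bstr() ),  "\n";
print 'digits in $next: ', length( $next->bstr() ), "\n";
print '$next = ', $next->bstr(), "\n";
```

Unlike a native integer, the value is stored exactly, so adding 1 to sixty 9 digits correctly carries over into a 61-digit result.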

Section 2.4.6: Number Data Type

The number data type stores a single floating-point (decimal) number, and may hold any real number within your computer's data type limits.

    my number $foo  = -23.42;     # fine
    my number $bar  = 0.000_001;  # fine
    my number $baz  = 42.23;      # fine
    my number $bax  = 42;         # fine
    my number $quux = -4_123.456_789_123_456_789_123_456_789_123_456_789_123_456_789_123_456_789_123_456;  # likely error or data loss, outside data type limits

Section 2.4.7: Character Data Type

The character data type stores a single text character, which may include any letter or number or special character, and which must not be empty.

    my character $empty = '';    # error in RPerl, compiled (non-test) modes, can't be empty
    my character $foo   = 'f';   # fine
    my character $bar   = q{b};  # fine
    my character $baz   = '7';   # fine
    my character $bax   = "\n";  # fine, newline counts as a single character
    my character $quux  = 'ab';  # error in RPerl, compiled (non-test) modes, too many characters

Section 2.4.8: String Data Type

The string data type stores one or more text characters, which may include any letter or number or special character, and which may be empty.

    my string $empty = '';         # fine
    my string $foo   = 'f';        # fine
    my string $bar   = q{bar};     # fine
    my string $baz   = '7baz';     # fine
    my string $bax   = "\n\n\t";   # fine, counts as 3 characters
    my string $quux  = 'abcdefg';  # fine

Section 2.4.9: Type Conversion

To convert from one data type to another, we use the RPerl data type conversion subroutines, which are automatically available for use in all RPerl source code. All 7 RPerl data types can be converted to any of the other 6 RPerl data types, so in total there are 42 type conversion subroutines. Remember the 7 RPerl data types are as follows:

  • boolean
  • unsigned_integer
  • integer
  • gmp_integer
  • number
  • character
  • string

The name and calling convention of each subroutine takes the form SOURCE_to_DESTINATION(INPUT), where SOURCE is replaced by the current data type from which you are converting, DESTINATION is replaced by the future data type to which you wish to convert, and the argument INPUT is replaced by the input data which should have the data type of SOURCE. Your INPUT can be any source of scalar data such as a variable, a literal, a constant, or a subroutine which returns a scalar value. The return value of each type conversion subroutine has a data type of DESTINATION, so if you want to store the newly-converted return value for later use, then you will need to make sure you utilize a variable with the correct type.

For example, if you want to convert from the input of a string variable named $foo to the output of a number data type, then you simply call string_to_number($foo). If you want to store the number return value of this type conversion, then you must utilize a variable of number type to do so. Thus, if you are providing a variable as the input argument for a type conversion subroutine, and if you are also storing the type conversion's return value in a variable, then these must always be two different variables because they will always have two different data types.

    #!/usr/bin/env perl

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils

    # [[[ OPERATIONS ]]]
    my string $foo = '723.555_777';
    my number $bar = string_to_number($foo);
    my number $double_bar = $bar * 2;
    print 'have $bar = ', number_to_string($bar), "\n";
    print 'have $double_bar = ', number_to_string($double_bar), "\n";

If we run our code example above, we should receive the following output, which correctly displays the underscore digit separator for long numbers:

    have $bar = 723.555_777
    have $double_bar = 1_447.111_554

BEST PRACTICES

  • Numeric data types should always be converted to a string data type before being passed to print or otherwise displayed or saved.

If we do not call the number_to_string() subroutine before the values of $bar and $double_bar are processed by the print operator, then our output will be missing the proper underscore separators:

    print 'have $bar = ', $bar, "\n";                # no number_to_string(), not best practices
    print 'have $double_bar = ', $double_bar, "\n";  # no number_to_string(), not best practices

We no longer receive the exact output desired:

    have $bar = 723.555777
    have $double_bar = 1447.111554
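
If you are curious how such underscore separators could be produced, here is a minimal sketch in plain Perl. The subroutine with_underscores() is a hypothetical helper written for this illustration; it is not how number_to_string() is implemented, and it assumes the input is already a simple decimal string.

```perl
use strict;
use warnings;

# Hypothetical helper: insert an underscore between each group of 3 digits,
# counting outward from the decimal point on both sides.
sub with_underscores {
    my ($text) = @_;
    my ( $sign, $whole, $fraction )
        = $text =~ m{\A ([-+]?) (\d+) (?: [.] (\d+) )? \z}xms
        or die "with_underscores: not a plain decimal number: $text\n";

    # Whole part: repeatedly split off the right-most group of 3 digits.
    1 while $whole =~ s/\A(\d+)(\d{3})/${1}_$2/xms;
    my $result = $sign . $whole;

    # Fractional part: group from the left, only when more digits follow.
    if ( defined $fraction ) {
        $fraction =~ s/(\d{3})(?=\d)/${1}_/gxms;
        $result .= q{.} . $fraction;
    }
    return $result;
}

my $bar        = with_underscores('723.555777');
my $double_bar = with_underscores('1447.111554');

print 'have $bar = ',        $bar,        "\n";
print 'have $double_bar = ', $double_bar, "\n";
```

In real RPerl code you should simply call number_to_string() as recommended above, instead of writing a helper like this yourself.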

In most cases, the shorthand subroutine to_string() may be used instead of one of the other SOURCE_to_string() type conversion subroutines. The special to_string() data type conversion subroutine will attempt to automatically detect the data type of the input argument value. We will receive the correct desired output if we replace number_to_string() with to_string() in our example above:

    print 'have $bar = ', to_string($bar), "\n";
    print 'have $double_bar = ', to_string($double_bar), "\n";

BEST PRACTICES

  • Numeric data should always include proper underscore separators, even when stored inside a string data type.

If the variable $foo was a number instead of a string, then RPerl would give an ERROR ECOPARP00 parse error and complain of Long number not separated with underscores. However, since $foo is of the string data type, then it is not checked for proper numeric underscores, and thus we can omit the underscore in the value of $foo without errors and with the proper output as seen above. However, this is not best practices, and we should consider the underscore to be missing in the following:

    my string $foo = '723.555777';  # no underscore, not best practices

Following is a list of all 42 RPerl data type conversion subroutines, not counting the special shorthand to_string() subroutine:
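
The 42 names can be derived mechanically from the SOURCE_to_DESTINATION naming convention described earlier. The plain-Perl sketch below builds them from the 7 scalar data type names, on the assumption that every pairing follows the convention exactly; it only prints names and does not call any RPerl subroutine.

```perl
use strict;
use warnings;

my @types = qw(boolean unsigned_integer integer gmp_integer number character string);

# Pair every source type with every other destination type.
my @conversion_names;
foreach my $source (@types) {
    foreach my $destination (@types) {
        next if $source eq $destination;    # no conversion from a type to itself
        push @conversion_names, $source . '_to_' . $destination;
    }
}

print scalar(@conversion_names), " type conversion subroutine names:\n";
foreach my $conversion_name (@conversion_names) {
    print q{    }, $conversion_name, "\n";
}
```

Each of the 7 types pairs with the 6 others, which is where the total of 7 * 6 = 42 comes from.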

You must utilize type conversions when you are assigning the value of one variable to another variable of a different data type, otherwise you will have a type mismatch which will probably cause an RPerl subcompile error. Be aware that you may experience data loss when converting to a data type capable of storing less information:

    my integer $foo = 23;
    my number $bar  = $foo;  # error in RPerl, compiled (non-test) modes, type mismatch

    my integer $foo = 23;
    my number $bar  = integer_to_number($foo);  # fine, $bar is now 23.0

    my number $foo  = 23.42;
    my integer $bar = $foo;  # error in RPerl, compiled (non-test) modes, type mismatch

    my number $foo  = 23.42;
    my integer $bar = number_to_integer($foo);  # fine, $bar is now 23, data loss has occurred

Section 2.4.10: Scope, Type, Name, Value

Every variable has 4 primary qualities, all of which are provided during variable initialization, in the following order:

  • Scope
  • Type
  • Name
  • Value

The "scope" of a variable describes where in the source code the variable is valid and available for use, and does not change after your RPerl software starts to execute. In normal Perl, you will find a combination of locally-scoped variables which are declared using the my keyword, globally-scoped variables declared using the our keyword, and state variables declared using the state keyword. (Old Perl 4 programmers also made use of the local keyword, now replaced by my.)

Local variables are only usable within their own enclosing code block, such as the body of a conditional statement, loop, or subroutine. If a local variable is declared in the main operations section of an RPerl program file, then it is considered to be outside of any code block, and it may only be utilized in the operations section of its own program file.

Global variables are usable within any code block accessible by Perl. Except for certain special variables such as our $VERSION, all variables in RPerl are locally-scoped using the my keyword. RPerl does not currently support state variables.
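
Local scope can be demonstrated with a short plain-Perl sketch. The type names are omitted so the code runs under normal Perl without RPerl, and the string form of eval is used only so the failed compile attempt can be captured and reported instead of stopping the whole program; this is an illustration device, not something to copy into RPerl code.

```perl
use strict;
use warnings;

my $outer = 'usable in the rest of this file';

{
    my $inner = 'usable only inside this block';
    print 'inside block:  ', $outer, ' / ', $inner, "\n";
}

# $inner no longer exists here, so compiling a reference to it fails.
my $compiled = eval q{ my $copy = $inner; 1; };
my $error    = $@;

print 'outside block: $inner is ',
    ( $compiled ? 'still usable' : 'not usable' ), "\n";
```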

The "type" of a variable is simply the data type which is stored inside the variable, as described in detail throughout the preceding subsections of "Section 2.4: Variables With Scalar Values". A variable's type must be set during declaration and may not be altered thereafter. Remember there are 7 scalar data types in RPerl: boolean, unsigned_integer, integer, gmp_integer, number, character, and string.

The "name" of a variable is the word or phrase you type after a dollar-sign $ character, in order to access the variable. Like the scope and type, a variable's name may not be changed after the point of variable declaration, when the software begins execution. Closely related is the "namespace" of a variable, which is a named group of variables and constants and possibly other things; except where specified in "CHAPTER 11: CLASSES, PACKAGES, MODULES, LIBRARIES", all RPerl variables exist outside of any namespace, which is to say that normal RPerl variables do not have a namespace.

The "value" of a variable is the actual data which has been stored inside of the variable, and which may be accessed or modified as your RPerl software is running. A variable's value must be compatible with its type, otherwise an error or undefined behavior may occur.

When your RPerl application is running in normal interpreted mode AKA test mode (Perl operations and Perl data types), then you will be able to access the special subroutines type(), types(), name() and scope_type_name_value(). These four subroutines provide the ability to perform "introspection" on RPerl variables, which means your RPerl application can access information about itself, specifically about its own variables. The types() subroutine is not used for scalar data types as discussed in this chapter, please see "CHAPTER 3: ARRAY VALUES & VARIABLES" and "CHAPTER 6: HASH VALUES & VARIABLES" for more information.

The following source code example shows a simple string variable named $foo (as usual), which we then pass as an argument to the introspection subroutines and display the results:

    #!/usr/bin/env perl

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils

    # [[[ OPERATIONS ]]]

    my string $foo = 'howdy';
    print 'have $foo = ', $foo, "\n";
    print 'have type($foo) = ', type($foo), "\n";
    print 'have name($foo) = ', name($foo), "\n";
    print 'have scope_type_name_value($foo) = ', "\n", scope_type_name_value($foo), "\n\n";

When this example is executed, the following output is produced:

    have $foo = howdy
    have type($foo) = string
    have name($foo) = $foo
    have scope_type_name_value($foo) = 
    my string $foo = 'howdy';

Section 2.4.11: Binary Assignment Operators

Often you will want to increase a numeric variable by some value, which can be achieved by calling the addition + operator, followed by the assignment = operator:

    $foo = $foo + 1;

We can shorten this example from two operators to only one by utilizing a "binary assignment operator", in this case the "plus-equals" += operator:

    $foo += 1;

We can achieve this effect for the four arithmetic operators addition +, subtraction -, multiplication *, and division /, as well as the string concatenation . operator. The right-hand side argument is not limited to numeric literals such as 1 in the above example, any expression can be utilized as long as it is of a compatible data type, as always.

Name                       Symbol  Arity   Fixity  Precedence  Associativity  Supported
-------------------------  ------  ------  ------  ----------  -------------  -----------
Add Assign                 +=      Binary  Infix   19          Right          Coming Soon
Subtract Assign            -=      Binary  Infix   19          Right          Coming Soon
Multiply Assign            *=      Binary  Infix   19          Right          Coming Soon
Divide Assign              /=      Binary  Infix   19          Right          Coming Soon
String Concatenate Assign  .=      Binary  Infix   19          Right          Coming Soon
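
All five binary assignment operators can be tried out in normal Perl, as in the sketch below. Type names are omitted so the code runs without RPerl; the table above marks RPerl support for these operators as "Coming Soon", so treat this as plain-Perl behavior only.

```perl
use strict;
use warnings;

my $count = 10;
$count += 5;    # same as $count = $count + 5;  now 15
$count -= 3;    # same as $count = $count - 3;  now 12
$count *= 2;    # same as $count = $count * 2;  now 24
$count /= 4;    # same as $count = $count / 4;  now 6

my $greeting = 'howdy';
$greeting .= ', world';    # same as $greeting = $greeting . ', world';

print 'have $count = ',    $count,    "\n";
print 'have $greeting = ', $greeting, "\n";
```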

Section 2.4.12: Auto-Increment & Auto-Decrement Operators

Recall the example from the previous section, regarding the use of binary assignment operators, where we first see long-hand source code using two operators:

    $foo = $foo + 1;

Then we shorten from two operators to only one operator:

    $foo += 1;

Now, for the special case of incrementing by a hard-coded integer value of 1, we can shorten this example even further to eliminate the numeric literal. This is achieved by using the post-increment ++ operator:

    $foo++;

Name            Symbol  Arity  Fixity   Precedence  Associativity  Supported
--------------  ------  -----  -------  ----------  -------------  ---------
Pre-Increment   ++      Unary  Prefix   03          Non            Yes
Post-Increment  ++      Unary  Postfix  03          Non            Yes
Pre-Decrement   --      Unary  Prefix   03          Non            Yes
Post-Decrement  --      Unary  Postfix  03          Non            Yes

    ++$i  # 0
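
The prefix and postfix forms both change the variable by 1, but they differ in the value they return to the surrounding expression. The following plain-Perl sketch (type names omitted so it runs without RPerl) shows the difference:

```perl
use strict;
use warnings;

my $i = 0;
my $post_increment = $i++;    # returns the old value 0, then $i becomes 1
my $pre_increment  = ++$i;    # $i becomes 2 first, then returns 2

my $j = 5;
my $post_decrement = $j--;    # returns the old value 5, then $j becomes 4
my $pre_decrement  = --$j;    # $j becomes 3 first, then returns 3

print 'post-increment returned ', $post_increment, ', pre-increment returned ', $pre_increment, "\n";
print 'post-decrement returned ', $post_decrement, ', pre-decrement returned ', $pre_decrement, "\n";
print 'final values: $i = ', $i, ', $j = ', $j, "\n";
```

When the operator is used as a stand-alone statement, such as $foo++; above, the returned value is discarded and the two forms have the same effect.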

Section 2.4.13: chop & chomp Operators

Often you will want to remove the special newline character "\n" from the end of a string variable, which can be achieved safely by using the chomp operator. When you pass some variable as the operand to chomp, the operator will do nothing if the operand's final character is not a newline. Chomp knows how to use the correct newline character for each operating system, thanks to Perl's magic $INPUT_RECORD_SEPARATOR variable.

    my string $foo = 'howdy' . "\n";
    print $foo;
    print $foo;

    chomp $foo;  # newline character removed
    print $foo;
    print $foo;

    chomp $foo;  # no effect
    print $foo;
    print $foo;

In the example source code above, only the first call to the chomp operator has an effect on the value of the $foo variable. When you run this example source code, you should receive the following output:

    howdy
    howdy
    howdyhowdyhowdyhowdy

Sometimes you will want to trim the final character of a string variable, regardless of whether it is a newline or any other specific character. For these cases, you will want to use the chop operator, which is likened to a non-safe variant of chomp. Obviously, chop and chomp actually perform two different operations, so you will always need to be careful and choose which operator to utilize on a case-by-case basis.

Let's modify the previous code example to use the chop operator instead of chomp:

    my string $foo = 'howdy' . "\n";
    print $foo;
    print $foo;

    chop $foo;    # newline character removed
    print $foo;
    print $foo;

    chop $foo;    # 'y' character removed
    print $foo;
    print $foo;

The trailing newline character is trimmed on the first call to chop, and then the final 'y' character of 'howdy' is deleted on the second call to the chop operator. This generates the following output; notice the missing 'y' characters:

    howdy
    howdy
    howdyhowdyhowdhowd
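
In normal Perl, the two operators also differ in what they return: chomp returns the number of characters it removed, while chop returns the character it removed. The sketch below is plain Perl with type names omitted; whether RPerl exposes these return values in the same way is not covered here.

```perl
use strict;
use warnings;

my $foo = 'howdy' . "\n";

my $first_chomp  = chomp $foo;    # newline removed, returns 1
my $second_chomp = chomp $foo;    # nothing to remove, returns 0
my $chopped      = chop $foo;     # removes and returns the final 'y'

print 'first chomp removed ',  $first_chomp,  " character(s)\n";
print 'second chomp removed ', $second_chomp, " character(s)\n";
print 'chop removed the character ', $chopped, "\n";
print 'have $foo = ', $foo, "\n";
```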

Section 2.4.14: GMP Integer Operators

The GMP library provides a large number of operators which are only for use with variables of the gmp_integer data type, and these GMP operators are made available in RPerl as subroutines which all begin with "gmp_". These GMP operator subroutines are loaded by the use rperlgmp; command.

gmp_integer_to_boolean gmp_integer_to_unsigned_integer gmp_integer_to_integer gmp_integer_to_number gmp_integer_to_character gmp_integer_to_string boolean_to_gmp_integer integer_to_gmp_integer unsigned_integer_to_gmp_integer number_to_gmp_integer character_to_gmp_integer string_to_gmp_integer

gmp_init gmp_init_set_unsigned_integer gmp_init_set_signed_integer

gmp_set gmp_set_unsigned_integer gmp_set_signed_integer gmp_set_number gmp_set_string gmp_get_unsigned_integer gmp_get_signed_integer gmp_get_number gmp_get_string gmp_add gmp_sub gmp_mul gmp_mul_unsigned_integer gmp_mul_signed_integer gmp_sub_mul_unsigned_integer gmp_add_mul_unsigned_integer gmp_neg gmp_div_truncate_quotient gmp_cmp

Section 2.5: Constant Data

Sometimes you have a piece of data that will never change once you compile your software, which is what we call "constant data" or just a "constant" for short.

Constants are the opposite of variables, because constants are designed not to change, while variables are designed to change as many times as needed.

Below is an example RPerl program with two constants, one containing numeric data and another containing text data:

    #!/usr/bin/env perl
 
    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;
 
    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitConstantPragma ProhibitMagicNumbers)  # USER DEFAULT 3: allow constants
 
    # [[[ CONSTANTS ]]]
    use constant PI  => my number $TYPED_PI  = 3.141_59;
    use constant PIE => my string $TYPED_PIE = 'pecan';

These two constants can be utilized as follows:

    my number $area = PI() * $r ** 2;
    my string $dessert = 'having a nice slice of ' . PIE();

As seen in the examples above, we must first declare each constant to have an all-uppercase name via the use constant command, after which we can access the stored data value by calling the constant name followed by empty parentheses. (In normal interpreted Perl, a constant is functionally equivalent to a subroutine which accepts no arguments and performs no operations other than to call the return operator with a hard-coded data value; see "CHAPTER 4: ORGANIZING BY SUBROUTINES" for more information.)

In all RPerl source code files which contain constants, we must include the "USER DEFAULT 3" no critic command; for numeric constants, we must also include "USER DEFAULT 1", as seen in the code example above.

All constants must utilize names with uppercase-only lettering, in order to distinguish them from similarly-named variables. Thus, a normal variable may not have an all-uppercase name. (Special variables called "file handles" must be all-uppercase; see "CHAPTER 5: READING & WRITING FILES" for more information.)

    use constant PIE => my string $TYPED_PIE = 'peanut butter';  # fine
    use constant Pie => my string $TYPED_Pie = 'coffee cream';   # error in RPerl, compiled modes
    use constant pie => my string $TYPED_pie = 'mocha silk';     # error in RPerl, compiled modes

    my string $PIE = 'oreo';        # error in RPerl, compiled modes
    my string $Pie = 'blackberry';  # okay
    my string $pie = 'blueberry';   # best

The declaration of constants follows a purposefully-repetitive format starting with the use constant command, followed by the all-uppercase constant name, a fat-arrow (AKA fat-comma) =>, the my command, the desired data type, a variable named $TYPED_ with the same all-uppercase name appended, the equal-sign = assignment operator, and finally the desired data value with a closing semicolon. Within a constant declaration, the all-uppercase constant name must appear identically in both places, which is a language construct enabling us to provide a data type where it would otherwise be impossible to do so.

    use constant PIE => my string $TYPED_PIE = 'cherry';        # fine
    use constant PIE =>                        'black forest';  # error in RPerl, compiled modes
    use constant PIE => my string $TYPED_PI  = 'grasshopper';   # error in RPerl, compiled modes
    use constant PIE => my string $TYPED_POE = 'raven';         # error in RPerl, compiled modes

You may access the value of a constant as many times as you like, but you may never change the value once the software has been compiled.

    $pie = PIE();       # fine
    $pie = $PIE;        # error in RPerl, compiled modes
    $pie = $TYPED_PIE;  # error in RPerl, compiled modes

    PIE()      = 'chocolate bourbon';   # error in Perl and RPerl
    $PIE       = 'chocolate chip';      # error in RPerl, compiled modes
    $TYPED_PIE = 'death by chocolate';  # error in RPerl, compiled modes

Constant values must be literals and not other constants or variables.

    use constant PIE => my string $TYPED_PIE = 'banana cream';  # fine
    use constant PIE => my string $TYPED_PIE = PI();            # error in RPerl, compiled modes
    use constant PIE => my string $TYPED_PIE = $pie;            # error in RPerl, compiled modes

It is always possible to use a variable instead of a constant, but not the other way around. You should use constants whenever possible, because they will always be at least as fast as variables, and often faster!

                        my string $pie       = 'lemon meringue';  # okay
    use constant PIE => my string $TYPED_PIE = 'key lime';        # best
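
Earlier in this section, a constant in normal interpreted Perl was described as functionally equivalent to a subroutine which only returns a hard-coded value. The sketch below illustrates that comparison in plain Perl 5, without RPerl data types or the $TYPED_ variable, so it is not valid RPerl source code; the subroutine name PIE_SUBROUTINE is hypothetical:

```perl
use strict;
use warnings;

# a constant declared the plain Perl way, with no RPerl data type
use constant PIE => 'pecan';

# a hand-written subroutine which only returns a hard-coded data value
sub PIE_SUBROUTINE { return 'pecan'; }

# both are accessed by name, followed by empty parentheses
my $from_constant   = 'having a nice slice of ' . PIE();
my $from_subroutine = 'having a nice slice of ' . PIE_SUBROUTINE();

print $from_constant,   "\n";
print $from_subroutine, "\n";
```

Both print operators display the same text, because both calls return the same hard-coded value.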

Section 2.6: Displaying Output Using The print Operator

You will often want to display some text output while your RPerl application is running, and for this we use the print operator:

    print 'Hello, World!', "\n";

Unsurprisingly, calling print as shown above will display the 13-character global greeting, followed by a special "newline" character which moves the computer cursor down to the next line. (The "n" in "\n" stands for "newline", which is very closely related to the concepts of "line feed", "carriage return", and pressing the "Enter" key on a computer keyboard. The history of these terms harkens back to the days of manual typewriters and is worth spending a few minutes to learn about on Wikipedia, if you are into that sort of thing.)

Multiple operands are passed to the print operator by separating them with commas. The above example passes 2 operands to be received by the print operator, first the greeting literal and second the newline character. You may pass as many operands to print as you like.

You can also print the contents of variables and constants. For example, the 3 operations below produce the exact same output as the 1 operation above:

    use constant NEWLINE => my string $TYPED_NEWLINE = "\n";
    my string $greeting = 'Hello, World!';
    print $greeting, NEWLINE();

Section 2.6.1: STDOUT & STDERR

When you call the print operator, it sends all output to the operating system's default output data stream, which is a special component called "standard output" or just "standard out" for short, written using the STDOUT keyword in Perl. You may explicitly specify STDOUT for any print operator, although this is not necessary because it is already the default behavior. When specified, the output stream has an asterisk * character prepended, and is then wrapped in curly-braces.

So again, the following operation produces the exact same output as the 2 preceding examples:

    print {*STDOUT} 'Hello, World!', "\n";

You will notice there is no comma after {*STDOUT}; you should only type a blank space character, and then specify the first operand. Other than that, the print operator works exactly the same both with and without the extra STDOUT included.

In addition to STDOUT, there is a second output data stream provided by your operating system, which is called "standard error" and is written using the STDERR keyword. This stream is used to display actual error and warning messages, as well as optional debugging and diagnostic information. The following operation will display an example warning message:

    print {*STDERR} 'WARNING: Danger, Will Robinson!', "\n";

Your software will continue running after you call print to send output to STDERR; if you want execution to immediately stop after displaying an error message, then you will need to explicitly add a termination operator such as exit or die or croak. The operand value of 1 passed to the exit operator below is used to inform the operating system that we have encountered an error. (Please see NEED_ADD_SECTION for more information.)

    print {*STDERR} 'ERROR: A fatal error or failure has occurred, aborting.', "\n";
    exit 1;
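
The curly-brace syntax shown above is not special to STDOUT and STDERR; it simply names the stream which should receive the output. As a rough sketch in plain Perl 5, the example below uses two in-memory file handles as stand-ins for the two output streams, which is purely an assumption made so the result can be inspected afterward; the names $normal_handle and $diagnostic_handle are hypothetical. File handles are covered properly in "CHAPTER 5: READING & WRITING FILES".

```perl
use strict;
use warnings;

# in-memory stand-ins for the two output streams (for illustration only)
my $normal_text     = q{};
my $diagnostic_text = q{};
open my $normal_handle,     '>', \$normal_text     or die 'ERROR: cannot open in-memory file handle', "\n";
open my $diagnostic_handle, '>', \$diagnostic_text or die 'ERROR: cannot open in-memory file handle', "\n";

# same form as print {*STDOUT} and print {*STDERR}:
# stream in curly-braces, then a blank space, and no comma before the first operand
print {$normal_handle}     'Hello, World!',                   "\n";
print {$diagnostic_handle} 'WARNING: Danger, Will Robinson!', "\n";

close $normal_handle     or die 'ERROR: cannot close in-memory file handle', "\n";
close $diagnostic_handle or die 'ERROR: cannot close in-memory file handle', "\n";

# each stream received only the output which was explicitly sent to it
print {*STDOUT} $normal_text;
print {*STDERR} $diagnostic_text;
```

Each stand-in stream ends up holding only its own message, which is the same separation that allows STDOUT and STDERR to be viewed or discarded independently.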

As mentioned above, you may also choose to use STDERR to display debugging and diagnostic messages, which are neither warnings nor errors. For example, let's say you are writing an RPerl application which primarily relies on the $foo variable, so your users want to see the value of $foo when the program is finished running. Your code also utilizes the secondary variables named $bar, $bat, and $baz, but the users don't care about the values of these variables, only you (and other software developers) care about them. Following is an example of using print for both normal output to STDOUT (by default), as well as diagnostic output to STDERR:

    #!/usr/bin/env perl

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils

    # [[[ OPERATIONS ]]]

    my number  $bar =    17.0;
    my number  $bat = 2_112.42;
    my integer $baz =    23;
    my number  $foo;

    print {*STDERR} 'before arithmetic, have $bar = ', number_to_string($bar), "\n";
    print {*STDERR} 'before arithmetic, have $bat = ', number_to_string($bat), "\n";
    print {*STDERR} 'before arithmetic, have $baz = ', integer_to_string($baz), "\n";

    $bat = $bar;
    $bar = $baz / ($bar * 3);
    $baz = $baz + 1;
    $foo = $bar + $bat;

    print '$foo = ', number_to_string($foo), "\n";

    print {*STDERR} 'after arithmetic, have $bar = ', number_to_string($bar), "\n";
    print {*STDERR} 'after arithmetic, have $bat = ', number_to_string($bat), "\n";
    print {*STDERR} 'after arithmetic, have $baz = ', integer_to_string($baz), "\n";

Now you may easily choose to view only the STDOUT, or STDERR, or both. Here we see a normal execution of the program with both output streams displayed:

    $ ./my_program.pl 
    before arithmetic, have $bar = 17
    before arithmetic, have $bat = 2_112.42
    before arithmetic, have $baz = 23
    $foo = 17.450_980_392_156_9
    after arithmetic, have $bar = 0.450_980_392_156_863
    after arithmetic, have $bat = 17
    after arithmetic, have $baz = 24

Next, we execute the same program again; this time we append > /dev/null after our program name, which discards all STDOUT output by redirecting it to a "null" or empty device. (In the Windows operating system, replace > /dev/null with > nul.) This results in only the STDERR output being displayed:

    $ ./my_program.pl > /dev/null
    before arithmetic, have $bar = 17
    before arithmetic, have $bat = 2_112.42
    before arithmetic, have $baz = 23
    after arithmetic, have $bar = 0.450_980_392_156_863
    after arithmetic, have $bat = 17
    after arithmetic, have $baz = 24

Lastly, we run the same program a third time, now with 2> /dev/null appended to discard all STDERR output, so we only see STDOUT:

    $ ./my_program.pl 2> /dev/null
    $foo = 17.450_980_392_156_9

It is easy to insert a hash character (AKA octothorpe) # in front of each print {*STDERR} operator to comment it out, thereby disabling the diagnostic output while allowing it to be re-enabled by future developers:

    #!/usr/bin/env perl

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils

    # [[[ OPERATIONS ]]]

    my number  $bar =    17.0;
    my number  $bat = 2_112.42;
    my integer $baz =    23;
    my number  $foo;

    #print {*STDERR} 'before arithmetic, have $bar = ', number_to_string($bar), "\n";
    #print {*STDERR} 'before arithmetic, have $bat = ', number_to_string($bat), "\n";
    #print {*STDERR} 'before arithmetic, have $baz = ', integer_to_string($baz), "\n";

    $bat = $bar;
    $bar = $baz / ($bar * 3);
    $baz = $baz + 1;
    $foo = $bar + $bat;

    print '$foo = ', number_to_string($foo), "\n";

    #print {*STDERR} 'after arithmetic, have $bar = ', number_to_string($bar), "\n";
    #print {*STDERR} 'after arithmetic, have $bat = ', number_to_string($bat), "\n";
    #print {*STDERR} 'after arithmetic, have $baz = ', integer_to_string($baz), "\n";

Now we don't need any redirection on the command line to suppress STDERR output, and our program will run faster:

    $ ./my_program.pl
    $foo = 17.450_980_392_156_9

Section 2.7: Program Control Using The if Conditional Statement

Often you will want to perform a task, but only if some specific "condition" is met; this is called a "conditional statement" or just "conditional" for short, and is implemented using the if statement in Perl. You may also refer to the if statement as a "control structure", because it is a source code structure used to control the execution flow of a piece of Perl software. In other words, a conditional statement can control how your software runs, and so your software may run differently depending on your conditional statements.

Whether or not an if statement's task is actually performed is determined by the truth value of its condition, which is specified in the conditional statement's "header", contained within parentheses immediately after the if keyword; please review "Section 2.1.9: Truth Values" for more information. The header of a conditional statement contains only its condition. The task to be conditionally performed is known as the conditional statement's "body", and is specified within curly-braces immediately after the header.

    if ( $my_integer == 17 ) { print 'I got seventeen!', "\n"; }

We can read the source code above as "if the value of the variable $my_integer is equal to 17, then display the text 'I got seventeen!' followed by a newline character"; or for short we can just say "if $my_integer is 17, print 'I got seventeen!'". Note the use of the word "then" in the long-form translation; if statements are also commonly referred to as "if - then" statements, although in Perl the word "then" is not used in actual source code. We will include the word "then" in the translations below, for ease of comprehension.

Section 2.7.1: Conditional Chaining With if & elsif & else

So, what if you also want to perform a different task when your if statement's condition is evaluated as false? Then use the else keyword with its own body:

    if ( $my_integer == 17 ) { print 'I got seventeen!',         "\n"; }
    else                     { print 'I did not get seventeen.', "\n"; }

We can read the two lines of code above as "if $my_integer is 17, then print 'I got seventeen!'; or else print 'I did not get seventeen.'".

Now let's add a third check, to see if we match one more possible value of $my_integer before declaring that these aren't the values (or droids) we're looking for. The following three lines of code can be read as "if $my_integer is 17, then print 'I got seventeen!'; or else if $my_integer is 23, then print 'I got twenty-three!'; or else print 'I did not get either.'":

    if    ( $my_integer == 17 ) { print 'I got seventeen!',      "\n"; }
    elsif ( $my_integer == 23 ) { print 'I got twenty-three!',   "\n"; }
    else                        { print 'I did not get either.', "\n"; }

When at least one elsif or else statement is appended to an if statement, it is called a "conditional chain".

Now let's look at the output values of some slightly less simple conditional statements. For example, perhaps you want to increment the value of an integer stored in the variable named $heart, but only until it reaches a value of 3. You can achieve this (in a contrived manner) by executing the following RPerl program:

    #!/usr/bin/env perl

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils

    # [[[ OPERATIONS ]]]

    my integer $heart = 0;
    print '$heart = ', $heart, "\n";

    if ( $heart < 3 ) { $heart++; }
    if ( $heart < 3 ) { $heart++; }
    if ( $heart < 3 ) { $heart++; }
    if ( $heart < 3 ) { $heart++; }
    if ( $heart < 3 ) { $heart++; }

    print '$heart = ', $heart, "\n";

What do you think will be the generated output from the code above? Well here it is:

    $heart = 0
    $heart = 3

By studying the source code and output above, we can see there are five opportunities for the value of the $heart variable to be incremented via the ++ operator, but only three of them are actually executed. This is because the truth value of $heart < 3 is true while $heart is equal to 0, 1, and 2, but then the truth value of $heart < 3 becomes false once the value of $heart reaches 3. Obviously, a value of 3 is not less-than another equal value of 3, so the final 2 truth values are false.

According to the generated output, it appears we have achieved our simple goal of incrementing $heart until it reaches 3, but it can be difficult to know exactly what is happening while the program is running. Let's add some print operators to display more useful output:

    my integer $heart = 0;
    print '$heart = ', $heart, "\n";

    if ( $heart < 3 ) { $heart++; }
    print '$heart = ', $heart, "\n";

    if ( $heart < 3 ) { $heart++; }
    print '$heart = ', $heart, "\n";

    if ( $heart < 3 ) { $heart++; }
    print '$heart = ', $heart, "\n";

    if ( $heart < 3 ) { $heart++; }
    print '$heart = ', $heart, "\n";

    if ( $heart < 3 ) { $heart++; }
    print '$heart = ', $heart, "\n";

Now we can see what's really going on:

    $heart = 0
    $heart = 1
    $heart = 2
    $heart = 3
    $heart = 3
    $heart = 3

From this more informative output, we can see the final two print operators are called, but the final two increment ++ operators are not called, so "$heart = 3" is displayed two more times.

Each of the five if statements in the previous example is a syntactically independent conditional statement, which means you can change any of their conditions without affecting the evaluation of the other conditions. In other words, all five conditions with all five less-than < operators are evaluated every time this program runs, even if only three of the five if statement bodies are executed. This is fine, because it is the correct behavior we want in this example.

Now let's say (for some odd reason) you want to chain three if statements together where they are now dependent upon one another, so if you change any one condition then it may affect the evaluation of the following condition(s). You can achieve this by changing the second and third if statements to be elsif statements instead:

    my integer $heart = 0;
    print '$heart = ', $heart, "\n";

    if    ( $heart < 3 ) { $heart++; }
    elsif ( $heart < 3 ) { $heart++; }
    elsif ( $heart < 3 ) { $heart++; }

    print '$heart = ', $heart, "\n";

We can't include all the extra print operators because each elsif keyword must follow immediately after the preceding conditional body. Also, we can't link more than three if statements together without breaking the Perl Best Practices guideline against utilizing a "cascading if-elsif chain".

So what do you think the output will be for the above conditional chain? Here you go:

    $heart = 0
    $heart = 1

This time only the first of the three increment ++ operators was executed, so the value of $heart only increased from 0 to 1. This is because an elsif statement's condition is treated as false if any preceding if or elsif condition in its conditional chain is evaluated as true. In other words, both elsif statements in this example were totally skipped or "short-circuited" without even evaluating their less-than < operators, because the condition of the starting if statement is evaluated as true. It would literally be a waste of time to evaluate the second and third less-than operators, so our program will run a little bit faster by allowing Perl to automatically skip over those elsif statements completely.
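
If you would like to confirm the skipped conditions for yourself, one possible approach is to count how many times a condition is actually tested. The sketch below is plain Perl 5 rather than RPerl, and it relies on a hypothetical helper subroutine named heart_below_three() plus a hypothetical $evaluations counter, neither of which appears elsewhere in this book; subroutines are explained in "CHAPTER 4: ORGANIZING BY SUBROUTINES".

```perl
use strict;
use warnings;

my $heart       = 0;
my $evaluations = 0;    # how many times a condition has been tested so far

# hypothetical helper: count the call, then perform the same less-than test
sub heart_below_three {
    $evaluations++;
    return $heart < 3;
}

if    ( heart_below_three() ) { $heart++; }
elsif ( heart_below_three() ) { $heart++; }
elsif ( heart_below_three() ) { $heart++; }

print '$heart = ',              $heart,       "\n";
print 'conditions evaluated = ', $evaluations, "\n";
```

Only the condition of the starting if statement is tested, so the counter stops at 1 and $heart also ends at 1.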

Can you quickly tell what the output will be for the following example, where we have simply added an else statement?

    my integer $heart = 0;
    print '$heart = ', $heart, "\n";

    if    ( $heart < 3 ) { $heart++; }
    elsif ( $heart < 3 ) { $heart++; }
    elsif ( $heart < 3 ) { $heart++; }
    else                 { $heart++; }

    print '$heart = ', $heart, "\n";

In this case, the same short-circuiting occurs for the else as does for the elsif statements in this and the previous example, so our output is unchanged:

    $heart = 0
    $heart = 1

The longest conditional chain allowable by Perl Best Practices is four members long and is comprised of an if statement, followed by two elsif statements, followed by an else statement, as seen in the example above. If all your conditions test a single variable for equivalence to some specific values, as in our first $my_integer examples in this section, then instead of a conditional chain you may be able to utilize a hash data structure with the keys pre-set to the possible matching values; please see "CHAPTER 6: HASH VALUES & VARIABLES" for more information.
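
To give a rough idea of the hash-based alternative, the following sketch replaces the $my_integer conditional chain with a lookup. It is written in plain Perl 5 using an ordinary untyped hash, which is an assumption made for brevity and is not the typed form taught in chapter 6; the hash name %message_for is hypothetical:

```perl
use strict;
use warnings;

# keys are the possible matching values, hash values are the text to display
my %message_for = (
    17 => 'I got seventeen!',
    23 => 'I got twenty-three!',
);

my $my_integer = 23;
my $message    = 'I did not get either.';    # plays the role of the else body

# a single test replaces the whole if - elsif chain
if ( exists $message_for{$my_integer} ) { $message = $message_for{$my_integer}; }

print $message, "\n";
```

Adding another matching value now means adding one more key to the hash, instead of one more elsif statement to the chain.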

If your conditions are not all simple equivalence tests via the equals == operator, as in our $heart examples where the conditions include less-than < operators, then Perl Best Practices directs us to use a given ... when statement instead of any if or elsif statements at all. However, RPerl does not yet support given ... when, so if it is unavoidable then you may disable the rule against cascading conditional chains for just one source code file at a time:

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils
    ## no critic qw(ProhibitCascadingIfElse)  # USER DEFAULT 9: allow cascading conditional chains until given-when is implemented
    
    # [[[ OPERATIONS ]]]
    
    my integer $foo = 0;
    print '$foo = ', $foo, "\n";
    
    if    ( $foo < 3 ) { $foo++; }
    elsif ( $foo < 3 ) { $foo++; }
    elsif ( $foo < 3 ) { $foo++; }
    elsif ( $foo < 3 ) { $foo++; }
    else               { $foo++; }
    
    print '$foo = ', $foo, "\n";

As seen above, now we may add one or more additional elsif statements without triggering a Perl Best Practices violation error via Perl::Critic. This is achieved by the inclusion of the special "USER DEFAULT 9" no critic command, which disables the "ProhibitCascadingIfElse" policy. If we attempt to run this source code without the proper critic command, we will receive an error:

    $ rperl -t my_program.pl 
    ERROR ECOPAPC02, RPERL PARSER, PERL CRITIC VIOLATION
    Failed Perl::Critic brutal review with the following information:

        File Name:    ./lib/RPerl/Test/Conditional/program_08_bad_00.pl
        Line number:  22
        Policy:       Perl::Critic::Policy::ControlStructures::ProhibitCascadingIfElse
        Description:  Cascading if-elsif chain
        Explanation:  See Perl Best Practices page(s) 117, 118

However, with the proper "USER DEFAULT 9" no critic command in place, we receive no errors and the same values as before, now displayed for the $foo variable:

    $foo = 0
    $foo = 1

Section 2.7.2: Nested Conditionals

In all the examples above, the body of each conditional statement only includes one operation, such as a single increment ++ operator or a single print operator. However, the bodies of your conditional statements may include as many operations as you like. Also, conditional statements may be nested within one another:

    if ( $friend_enjoys_lasagna ) {
        if ( $other_lasagna_lovers_so_far == 0 ) {
            print 'Oh thank goodness, I thought I was the only one!', "\n";
        }
        elsif ( $other_lasagna_lovers_so_far == 1 ) {
            print 'Well, at least there are a few of us...', "\n";
        }
        else {
            print 'Welcome to the Lasagna Lovers club!', "\n";
        }
        $other_lasagna_lovers_so_far++;
    }
    else {
        print 'Why would anyone NOT like lasagna?', "\n";
    }

Section 2.7.3: Conditionals & Variable Declarations

When a variable is declared inside the body of any conditional statement, then said variable may only be utilized from within that same conditional body (or any nested bodies within it). In all the previous examples, we are implying unseen variable declarations which happen before (and thus outside) the conditional bodies, so all our example variables may be accessed or modified outside the conditional statements themselves.

If a variable is declared within a conditional statement's body, then we will receive an error if we try to utilize it outside the conditional:

    my integer $foo = 0;
    print 'outside conditional, have $foo = ', $foo, "\n";
    
    if ( $foo < 3 ) {
        $foo++;
        my integer $bar = $foo * 2;
        print 'inside  conditional, have $bar = ', $bar, "\n";
    }
    
    print 'outside conditional, have $foo = ', $foo, "\n";
    print 'outside conditional, have $bar = ', $bar, "\n";  # THIS CAUSES A PARSE ERROR

If we run the code above, the last line attempts to access the $bar variable outside of the conditional body where it was declared, so we receive an error:

    ERROR ECOPAPL02, RPERL PARSER, PERL SYNTAX ERROR
    Failed normal Perl strictures-and-fatal-warnings syntax check with the following information:
    
        File Name:        my_program.pl
        Return Value:     2
        Error Message(s): 
    
    Global symbol "$bar" requires explicit package name at my_program.pl line XYZ
    my_program.pl had compilation errors.

If we simply remove the last line where $bar is incorrectly accessed, then our code runs fine:

    outside conditional, have $foo = 0
    inside  conditional, have $bar = 2
    outside conditional, have $foo = 1
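
If you do need the value of $bar after the conditional has finished, one option is to declare it before the conditional and only assign to it inside the body. This is a minimal sketch in plain Perl 5, with RPerl data types omitted so it runs under the normal perl interpreter; the starting value of 0 for $bar is an arbitrary assumption:

```perl
use strict;
use warnings;

my $foo = 0;
my $bar = 0;    # declared outside the conditional, so it remains usable afterward

if ( $foo < 3 ) {
    $foo++;
    $bar = $foo * 2;    # assignment only, no "my" declaration here
    print 'inside  conditional, have $bar = ', $bar, "\n";
}

print 'outside conditional, have $foo = ', $foo, "\n";
print 'outside conditional, have $bar = ', $bar, "\n";    # no longer an error
```

The final print operator now displays a value of 2 for $bar, because the variable declared before the conditional is the same one modified inside the body.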

Section 2.8: Receiving Input From The User & STDIN

When you want the users of your software to provide some keyboard input, then you will need to use the STDIN keyword in Perl, which represents the "standard input" data stream. This will allow the user to type one line of input text and numbers, ended by pressing the Enter key. All user input is received in text format, and should initially be stored in a variable with a string data type. Unless converted to a different data type via one of RPerl's type conversion subroutines, the string variable should be passed to the chomp operator, which will remove the trailing newline character collected by STDIN.

When used in Perl, we wrap STDIN with "angle-brackets", which start with the less-than < character and end with the greater-than > character. In this context, the less-than and greater-than characters are not used as tests for numeric inequality. Also, when using STDIN in RPerl we must always include the "USER DEFAULT 4" no critic command, which enables use of STDIN in general.

Along with the two output streams STDOUT and STDERR, the input stream STDIN makes up the third of three standard I/O ("input / output") data streams available on most operating systems.

The following example asks the user for their name, then greets them by name:

    #!/usr/bin/env perl

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ OPERATIONS ]]]

    print 'Please input your first name: ';
    my string $first_name = <STDIN>;
    chomp $first_name;
    print 'Hello ', $first_name, ', nice to meet you!', "\n";

One possible execution of this example, with the input name "Will", produces the following output:

    Please input your first name: Will
    Hello Will, nice to meet you!

You may also receive numeric data input, with the use of the appropriate data type conversion subroutines:

    print 'Please input your age in years: ';
    my string $age_string = <STDIN>;
    my integer $age = string_to_integer($age_string);
    print 'You are ', integer_to_string($age), ' years old.', "\n";
    print 'In one year from now, you will be ', integer_to_string($age + 1), ' years old.', "\n";

In the above example, you may optionally omit the two calls to the integer_to_string() data type conversion subroutine, because it has no effect on any integer with a value less-than 1,000. One possible execution produces the following output:

    Please input your age in years: 34
    You are 34 years old.
    In one year from now, you will be 35 years old.

You may receive input from STDIN as many times as needed:

    print 'Please input your favorite band or musician: ';
    my string $favorite_band = <STDIN>;
    chomp $favorite_band;
    
    print 'Please input your favorite movie: ';
    my string $favorite_movie = <STDIN>;
    chomp $favorite_movie;
    
    print 'Please input your favorite TV show: ';
    my string $favorite_show = <STDIN>;
    chomp $favorite_show;
    
    print 'You are a fan of ', $favorite_band, ', ', $favorite_movie, ', and ', $favorite_show, q{.}, "\n";
    print '(I like Rush, Blade Runner, and Star Trek!)', "\n";

One possible run of this example produces:

    Please input your favorite band or musician: Neil Peart
    Please input your favorite movie: Blade Runner 2
    Please input your favorite TV show: TNG
    You are a fan of Neil Peart, Blade Runner 2, and TNG.
    (I like Rush, Blade Runner, and Star Trek!)

Section 2.9: Program Control Using The while Loop

Most programs will contain one or more operations which need to be repeated a certain number of times, and for these cases we use a "loop statement" or just a "loop" for short. Like a conditional statement, a loop statement is a kind of control structure, because they are both used to control the execution flow of your software.

In this section, we will describe the most basic loop in Perl, which is the while loop statement. Just like an if conditional statement, a while loop has both a header containing only a condition, and a body of operations. There is only one difference between if and while statements: if executes once when the condition is true, and while executes repeatedly as long as the condition is true.
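
To see that one difference in action, the sketch below places the same condition and the same body in an if statement and in a while loop. It is plain Perl 5 with RPerl data types omitted, and the variable names $heart_if and $heart_while are hypothetical:

```perl
use strict;
use warnings;

my $heart_if    = 0;
my $heart_while = 0;

# the if body runs at most once
if ( $heart_if < 3 ) { $heart_if++; }

# the while body repeats until the condition becomes false
while ( $heart_while < 3 ) { $heart_while++; }

print '$heart_if    = ', $heart_if,    "\n";
print '$heart_while = ', $heart_while, "\n";
```

The if statement increments its variable once, leaving a value of 1, while the loop keeps incrementing until the value reaches 3; this single while loop does the work of the five repeated if statements from the $heart example in the previous section.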

The following loop will never stop running, because it has a condition which is hard-coded to numeric literal 1, which has a truth value of true. This is called an "infinite loop", and it will repeatedly print an unending amount of global greetings until the user sends a CTRL-C or similar termination signal:

    while (1) { print 'Hello, World!', "\n"; }

As with an if conditional statement, a while loop's condition may be any truth value, which is usually generated by a numeric comparison operator such as less-than < and greater-than > and equals ==, or a logical operator such as and and or and not. Also like a conditional statement, a loop's body may include as many operations as needed, on as many lines as needed.
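
As a minimal sketch of a compound condition, the following loop keeps iterating only while both numeric comparisons are true. It is written in plain Perl 5, with the RPerl data types omitted so that it can run without RPerl installed; the variable names $steps and $total, and the limits 20 and 10, are assumptions chosen only for illustration:

```perl
use strict;
use warnings;

my $steps = 0;
my $total = 0;

# keep adding 1, 2, 3, ... while BOTH comparisons are true;
# the two comparisons are joined by the logical "and" operator
while ( ( $total < 20 ) and ( $steps < 10 ) ) {
    $steps++;
    $total = $total + $steps;
}

print 'steps = ', $steps, ', total = ', $total, "\n";
```

In this sketch the loop stops as soon as either comparison becomes false, whichever happens first.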

Section 2.9.1: Loop Iterator Variables

Many loops use a designated variable called an "iterator", utilized for the sole purpose of keeping track of the current iteration count. Traditionally this variable is named $i, and its value is increased by one via the increment ++ operator called as the last operation in a loop body. The value of $i would then be tested via numeric comparison in the loop's condition:

    my integer $i = 0;
    while ( $i < 5 ) {
        print 'Hello, World #', $i, q{!}, "\n";
        $i++;
    }

In the example above, we first declare our iterator $i to start counting at a hard-coded numeric literal value of 0. All of computer science is based on binary logic with two possible binary values of 0 and 1; because of this and other reasons discussed in chapter 3, most counters and iterators start with 0 as their first value, 1 as their second value, and so forth. When $i starts at 0 and uses a less-than < operator to be compared to a certain maximum iteration value, then your while loop will iterate exactly that many times.

When we run the code above, we receive the following output:

    Hello, World #0!
    Hello, World #1!
    Hello, World #2!
    Hello, World #3!
    Hello, World #4!

If you really want to start $i at 1 instead of 0, then you have to use a less-than-or-equal <= operator to perform the same number of iterations:

    my integer $i = 1;
    while ( $i <= 5 ) {
        print 'Hello, World #', $i, q{!}, "\n";
        $i++;
    }

Now our output seems to make more sense to a human, but our iterator variable $i is no longer starting at 0:

    Hello, World #1!
    Hello, World #2!
    Hello, World #3!
    Hello, World #4!
    Hello, World #5!

In order to have our iterator $i start at 0, and also have our output make sense to a human, we call the addition + operator within parentheses as part of the print operator:

    my integer $i = 0;
    while ( $i < 5 ) {
        print 'Hello, World #', ( $i + 1 ), q{!}, "\n";
        $i++;
    }

Now we receive the desired output with $i starting at 0:

    Hello, World #1!
    Hello, World #2!
    Hello, World #3!
    Hello, World #4!
    Hello, World #5!
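
As a quick check of the counting rule described above, the following sketch counts the iterations of both loop styles. It is plain Perl 5 with the RPerl data types omitted; the names $max, $count_from_zero, and $count_from_one are hypothetical and used only for this illustration:

```perl
use strict;
use warnings;

my $max = 5;

# style 1: start at 0, compare with less-than
my $count_from_zero = 0;
my $i               = 0;
while ( $i < $max ) {
    $count_from_zero++;
    $i++;
}

# style 2: start at 1, compare with less-than-or-equal
my $count_from_one = 0;
$i = 1;
while ( $i <= $max ) {
    $count_from_one++;
    $i++;
}

print 'starting at 0: ', $count_from_zero, ' iterations', "\n";
print 'starting at 1: ', $count_from_one,  ' iterations', "\n";
```

Both styles should report the same number of iterations; only the values taken by $i inside the loop body differ.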

Section 2.9.2: Nested Loops

As with the if conditional statements, multiple while loops may be nested within one another. When you want to have iterator variables for 2 or even 3 nested loops, the traditional naming convention for simple cases is to start with iterator $i for the outer-most loop, then iterator $j for the next loop nested within the $i loop, and then iterator $k for the loop nested within the $j loop. If you need more or different iterators, you may choose any variable names which you find appropriate.

Also as with if conditional statements, when a variable is declared inside the body of a while loop statement, then said variable may only be utilized from within that same loop body (or any nested bodies within it). In the previous example, the integer variable $i is declared before the loop, so it may be accessed or modified outside the loop. In the next example, the variable $j is declared within the outer loop, so $j may not be utilized outside the outer loop.

Let's take a look at the simplest example of 3 nested loops:

    my integer $i = 0;
    while ( $i < 4 ) {
        my integer $j = 0;
        while ( $j < 3 ) {
            my integer $k = 0;
            while ( $k < 2 ) {
                print 'Hello, World #(', ( $i + 1 ), ', ', ( $j + 1 ), ', ', ( $k + 1 ), q{)!}, "\n";
                $k++;
            }
            $j++;
        }
        $i++;
    }

In each line of the generated output, the first number is from the outer-most $i loop, the second number is from the middle $j loop, and the last number is from the inner-most $k loop:

    Hello, World #(1, 1, 1)!
    Hello, World #(1, 1, 2)!
    Hello, World #(1, 2, 1)!
    Hello, World #(1, 2, 2)!
    Hello, World #(1, 3, 1)!
    Hello, World #(1, 3, 2)!
    Hello, World #(2, 1, 1)!
    Hello, World #(2, 1, 2)!
    Hello, World #(2, 2, 1)!
    Hello, World #(2, 2, 2)!
    Hello, World #(2, 3, 1)!
    Hello, World #(2, 3, 2)!
    Hello, World #(3, 1, 1)!
    Hello, World #(3, 1, 2)!
    Hello, World #(3, 2, 1)!
    Hello, World #(3, 2, 2)!
    Hello, World #(3, 3, 1)!
    Hello, World #(3, 3, 2)!
    Hello, World #(4, 1, 1)!
    Hello, World #(4, 1, 2)!
    Hello, World #(4, 2, 1)!
    Hello, World #(4, 2, 2)!
    Hello, World #(4, 3, 1)!
    Hello, World #(4, 3, 2)!
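
The variable scoping rule described earlier in this section can be sketched as follows. This is plain Perl 5 with the RPerl data types omitted; the variable names $square and $last_square are hypothetical and exist only for this illustration:

```perl
use strict;
use warnings;

my $i           = 0;
my $last_square = 0;    # declared before the loop, still usable after it

while ( $i < 3 ) {
    my $square = $i * $i;    # declared inside the body, usable only inside the body
    $last_square = $square;
    $i++;
}

print 'after loop, have $i = ', $i, ' and $last_square = ', $last_square, "\n";

# print $square;    # if uncommented, "use strict" rejects this line at compile time, because $square is out of scope here
```

To use a value after the loop has finished, copy it into a variable declared before the loop, as $last_square does in this sketch.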

Section 2.9.3: Loop Control Operators next & last

Sometimes you need to skip to the next loop iteration immediately, without finishing the current iteration. For this, we use the next operator:

    my integer $i = 0;
    
    print 'before    loop, have current $i = ', $i, "\n";
    
    while ( $i < 10 ) {
        print "\n";
        print 'top    of loop, have current $i = ', $i, "\n";
        $i++;
        if ( $i % 2 ) { next; }
        print 'bottom of loop, have next    $i = ', $i, "\n";
    }
    
    print "\n";
    print 'after     loop, have current $i = ', $i, "\n";

As you can see, we can freely mix if conditional statements with while loop statements. When we run the code above, the print operator at the bottom of the loop's body is reached only when the incremented $i is even, because odd values of $i will cause the modulo % operator to evaluate as a truth value of true, which then calls the next operator:

    before    loop, have current $i = 0
    
    top    of loop, have current $i = 0
    
    top    of loop, have current $i = 1
    bottom of loop, have next    $i = 2
    
    top    of loop, have current $i = 2
    
    top    of loop, have current $i = 3
    bottom of loop, have next    $i = 4
    
    top    of loop, have current $i = 4
    
    top    of loop, have current $i = 5
    bottom of loop, have next    $i = 6
    
    top    of loop, have current $i = 6
    
    top    of loop, have current $i = 7
    bottom of loop, have next    $i = 8
    
    top    of loop, have current $i = 8
    
    top    of loop, have current $i = 9
    bottom of loop, have next    $i = 10
    
    after     loop, have current $i = 10

Sometimes you need to exit a loop immediately, without finishing the current iteration; for this, we use the last operator. Let's take the previous example and simply replace the next operator with last instead:

    my integer $i = 0;
    
    print 'before    loop, have current $i = ', $i, "\n";
    
    while ( $i < 10 ) {
        print "\n";
        print 'top    of loop, have current $i = ', $i, "\n";
        $i++;
        if ( $i % 2 ) { last; }
        print 'bottom of loop, have next    $i = ', $i, "\n";
    }
    
    print "\n";
    print 'after     loop, have current $i = ', $i, "\n";

When we run the code above, the print operator at the bottom of the loop's body is never reached and the loop only executes for one (partial) iteration, because the very first incremented value of $i is odd and will cause the modulo % operator to evaluate as true, which then immediately calls the last operator:

    before    loop, have current $i = 0
    
    top    of loop, have current $i = 0
    
    after     loop, have current $i = 1

Section 2.9.4: Loop Labels & Loop Control Operator redo

Sometimes you need to restart a loop's current iteration, without actually finishing the current iteration first or re-executing the loop's header; for this, we use the redo operator. However, before we use the redo operator, we must first learn about loop labels.

When using nested loops, it can be confusing to keep track of which loop is being affected by which loop control operator. To solve this issue, we use loop labels, which are simply all-uppercase names prepended to the loop keyword, which is while in this case. The use of loop labels is optional in most cases, but is required when you want to reliably use loop control operators inside nested loops, and whenever you want to use the redo operator at all.

Let's take the previous example again, add the loop label MY_LOOP, and simply replace the last operator with redo MY_LOOP instead:

    my integer $i = 0;
    
    print 'before    loop, have current $i = ', $i, "\n";
    
    MY_LOOP: while ( $i < 10 ) {
        print "\n";
        print 'top    of loop, have current $i = ', $i, "\n";
        $i++;
        if ( $i % 2 ) { redo MY_LOOP; }
        print 'bottom of loop, have next    $i = ', $i, "\n";
    }
    
    print "\n";
    print 'after     loop, have current $i = ', $i, "\n";

When we run the code above, we see the redo operator provides the exact same behavior as next in this case, but that only applies to some while loops and will not always be true. (In the next chapter we will learn about another kind of loop, where next and redo never behave the same.)

    before    loop, have current $i = 0
    
    top    of loop, have current $i = 0
    
    top    of loop, have current $i = 1
    bottom of loop, have next    $i = 2
    
    top    of loop, have current $i = 2
    
    top    of loop, have current $i = 3
    bottom of loop, have next    $i = 4
    
    top    of loop, have current $i = 4
    
    top    of loop, have current $i = 5
    bottom of loop, have next    $i = 6
    
    top    of loop, have current $i = 6
    
    top    of loop, have current $i = 7
    bottom of loop, have next    $i = 8
    
    top    of loop, have current $i = 8
    
    top    of loop, have current $i = 9
    bottom of loop, have next    $i = 10
    
    after     loop, have current $i = 10

The redo operator does not re-execute its corresponding loop's header; it jumps directly to the first operation in the loop's body and continues execution from there. In contrast, the next operator does execute the loop's header, which in the case of while consists only of checking the loop's condition.

The following source code example is comprised of a while loop containing an if conditional with a "bail out" condition:

    my integer $i = 0;
    
    print 'before    loop, have current $i = ', $i, "\n\n";
    
    MY_LOOP: while ( $i < 5 ) {
        print 'top    of loop, have current $i = ', $i, "\n";
        $i++;
        if ( $i > 10 ) {
            print 'inside of loop, have next    $i = ', $i, '; value too big, bailing out!', "\n";
            last MY_LOOP;
        }
        next MY_LOOP;
    }
    
    print "\n";
    print 'after     loop, have current $i = ', $i, "\n";

When we run the example code above, we never reach the last operator inside the loop, because the next operator causes the loop's condition to be checked and the loop stops iterating before $i reaches a value greater-than 10. In fact, we could totally delete the next MY_LOOP line as well as the entire if conditional statement, and we would still receive the same output:

    before    loop, have current $i = 0
    
    top    of loop, have current $i = 0
    top    of loop, have current $i = 1
    top    of loop, have current $i = 2
    top    of loop, have current $i = 3
    top    of loop, have current $i = 4
    
    after     loop, have current $i = 5

In the following code, we have simply replaced next MY_LOOP with last MY_LOOP:

    my integer $i = 0;
    
    print 'before    loop, have current $i = ', $i, "\n\n";
    
    MY_LOOP: while ( $i < 5 ) {
        print 'top    of loop, have current $i = ', $i, "\n";
        $i++;
        if ( $i > 10 ) {
            print 'inside of loop, have next    $i = ', $i, '; value too big, bailing out!', "\n";
            last MY_LOOP;
        }
        last MY_LOOP;
    }
    
    print "\n";
    print 'after     loop, have current $i = ', $i, "\n";

As before, the body of the if conditional statement is never executed because $i never reaches a value higher than 10. Perhaps unsurprisingly, running this code produces a relatively short output:

    before    loop, have current $i = 0
    
    top    of loop, have current $i = 0
    
    after     loop, have current $i = 1

Of course, we will now replace next MY_LOOP with redo MY_LOOP:

    my integer $i = 0;
    
    print 'before    loop, have current $i = ', $i, "\n\n";
    
    MY_LOOP: while ( $i < 5 ) {
        print 'top    of loop, have current $i = ', $i, "\n";
        $i++;
        if ( $i > 10 ) {
            print 'inside of loop, have next    $i = ', $i, '; value too big, bailing out!', "\n";
            last MY_LOOP;
        }
        redo MY_LOOP;
    }
    
    print "\n";
    print 'after     loop, have current $i = ', $i, "\n";

The redo operator does not cause the while loop's header to be executed, so the $i < 5 condition is checked only once, before the very first iteration. Therefore, the value of $i continues to increase until it is greater-than 10 and triggers the "bail out" mechanism. (If only the economy was this simple.)

When we run the code, this is the output we see displayed:

    before    loop, have current $i = 0
    
    top    of loop, have current $i = 0
    top    of loop, have current $i = 1
    top    of loop, have current $i = 2
    top    of loop, have current $i = 3
    top    of loop, have current $i = 4
    top    of loop, have current $i = 5
    top    of loop, have current $i = 6
    top    of loop, have current $i = 7
    top    of loop, have current $i = 8
    top    of loop, have current $i = 9
    top    of loop, have current $i = 10
    inside of loop, have next    $i = 11; value too big, bailing out!
    
    after     loop, have current $i = 11

Now we will look at a simple nested loop example; we have 2 loops with the next operator called from the body of the inner loop:

    my integer $i = 0;
    my integer $j = 0;
    
    print 'before    outer loop, have current $i = ', $i, "\n";
    print 'before    outer loop, have current $j = ', $j, "\n";
    
    while ( $i < 3 ) {
        print "\n";
        print 'top    of outer loop, have current $i = ', $i, "\n";
        $i++;
        $j = 0;
        while ( $j < 5 ) {
            print 'top    of inner loop, have current $j = ', $j, "\n";
            $j++;
            if ( $j > 2 ) { next; }
            print 'bottom of inner loop, have current $j = ', $j, "\n";
        }
        print 'bottom of outer loop, have next    $i = ', $i, "\n";
    }
    
    print "\n";
    print 'after     outer loop, have current $i = ', $i, "\n";
    print 'after     outer loop, have current $j = ', $j, "\n";

When we run this code we can see that for every iteration of the outer loop, the bottom of the inner loop is only reached twice, when the incremented value of $j is 1 or 2. This is because when the next operator is called without a loop label, it is applied to the most immediate enclosing loop, which means next is applied to the inner loop in this case:

    before    outer loop, have current $i = 0
    before    outer loop, have current $j = 0
    
    top    of outer loop, have current $i = 0
    top    of inner loop, have current $j = 0
    bottom of inner loop, have current $j = 1
    top    of inner loop, have current $j = 1
    bottom of inner loop, have current $j = 2
    top    of inner loop, have current $j = 2
    top    of inner loop, have current $j = 3
    top    of inner loop, have current $j = 4
    bottom of outer loop, have next    $i = 1
    
    top    of outer loop, have current $i = 1
    top    of inner loop, have current $j = 0
    bottom of inner loop, have current $j = 1
    top    of inner loop, have current $j = 1
    bottom of inner loop, have current $j = 2
    top    of inner loop, have current $j = 2
    top    of inner loop, have current $j = 3
    top    of inner loop, have current $j = 4
    bottom of outer loop, have next    $i = 2
    
    top    of outer loop, have current $i = 2
    top    of inner loop, have current $j = 0
    bottom of inner loop, have current $j = 1
    top    of inner loop, have current $j = 1
    bottom of inner loop, have current $j = 2
    top    of inner loop, have current $j = 2
    top    of inner loop, have current $j = 3
    top    of inner loop, have current $j = 4
    bottom of outer loop, have next    $i = 3
    
    after     outer loop, have current $i = 3
    after     outer loop, have current $j = 5

Now let's take the same nested loops, add loop labels, and replace next with next OUTER_LOOP instead:

    my integer $i = 0;
    my integer $j = 0;
    
    print 'before    outer loop, have current $i = ', $i, "\n";
    print 'before    outer loop, have current $j = ', $j, "\n";
    
    OUTER_LOOP: while ( $i < 3 ) {
        print "\n";
        print 'top    of outer loop, have current $i = ', $i, "\n";
        $i++;
        $j = 0;
        INNER_LOOP: while ( $j < 5 ) {
            print 'top    of inner loop, have current $j = ', $j, "\n";
            $j++;
            if ( $j > 2 ) { next OUTER_LOOP; }
            print 'bottom of inner loop, have current $j = ', $j, "\n";
        }
        print 'bottom of outer loop, have next    $i = ', $i, "\n";
    }
    
    print "\n";
    print 'after     outer loop, have current $i = ', $i, "\n";
    print 'after     outer loop, have current $j = ', $j, "\n";

This time we can see the bottom of the inner loop is only reached twice, which is the same as before. However, now the bottom of the outer loop is never reached, because the next operator is being applied to the outer loop instead of the inner loop:

    before    outer loop, have current $i = 0
    before    outer loop, have current $j = 0
    
    top    of outer loop, have current $i = 0
    top    of inner loop, have current $j = 0
    bottom of inner loop, have current $j = 1
    top    of inner loop, have current $j = 1
    bottom of inner loop, have current $j = 2
    top    of inner loop, have current $j = 2
    
    top    of outer loop, have current $i = 1
    top    of inner loop, have current $j = 0
    bottom of inner loop, have current $j = 1
    top    of inner loop, have current $j = 1
    bottom of inner loop, have current $j = 2
    top    of inner loop, have current $j = 2
    
    top    of outer loop, have current $i = 2
    top    of inner loop, have current $j = 0
    bottom of inner loop, have current $j = 1
    top    of inner loop, have current $j = 1
    bottom of inner loop, have current $j = 2
    top    of inner loop, have current $j = 2
    
    after     outer loop, have current $i = 3
    after     outer loop, have current $j = 3

If we replace next OUTER_LOOP with next INNER_LOOP, then we will see the same default behavior as when next is called with no loop label.

What do you think will happen if we replace next with last or redo in these example cases? I will leave it up to you to experiment and find out for yourself! (Hint: redo INNER_LOOP and redo OUTER_LOOP will cause two different kinds of infinite loop.)

Section 2.9.5: Combining while & STDIN

Sometimes you want to accept multiple lines of user input via STDIN, instead of just one line at a time as in previous examples. To achieve this, we put the call to STDIN inside the loop's condition:

    #!/usr/bin/env perl
    
    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;
    
    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt
    
    # [[[ OPERATIONS ]]]
    
    my string $input_strings = q{};
    
    print 'Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D> on a blank line:', "\n";
    
    while ( my string $input_string = <STDIN> ) {
        $input_strings .= $input_string;
    }
    
    print "\n";
    print 'after loop, have $input_strings = ', "\n", $input_strings, "\n";

As displayed by the first print operator above, the user must press the two-keystroke combination CTRL-D in order to signal the "EOF" or "End Of File" condition, which tells the while loop to stop iterating. Also, we must remember to include the "USER DEFAULT 4" no critic command, thereby enabling use of STDIN.

When we run the code above, we can see all user input has been combined into one variable $input_strings via the string concatenation . operator:

    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D> on a blank line:
    howdy
    dowdy
    doo
    
    after loop, have $input_strings = 
    howdy
    dowdy
    doo
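
To experiment with the end-of-file behavior without typing at the keyboard, the sketch below reads from an in-memory file instead of STDIN. This is plain Perl 5 rather than RPerl, and it relies on filehandles, which have not been introduced at this point in the book; the names $fake_input, $input_handle, $line_count, and $input_joined are hypothetical and used only for illustration:

```perl
use strict;
use warnings;

# hypothetical stand-in for keyboard input: an in-memory file holding three lines
my $fake_input = "howdy\ndowdy\ndoo\n";
open my $input_handle, '<', \$fake_input or die 'Could not open in-memory file';

my $line_count   = 0;
my $input_joined = q{};

# when no lines remain (end-of-file), the read returns an undefined value and the loop stops
while ( my $input_line = <$input_handle> ) {
    chomp $input_line;
    $line_count++;
    if ( $line_count > 1 ) { $input_joined .= ', '; }
    $input_joined .= $input_line;
}
close $input_handle;

print 'received ', $line_count, ' lines: ', $input_joined, "\n";
```

Here the end of the in-memory file plays the same role as the user pressing CTRL-D, and chomp removes each trailing newline before the lines are joined.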

Section 2.10: Exercises

1. Constant Pi & Calculated Circumference Of A Circle [ 30 mins ]

In the same LearningRPerl directory from the chapter 1 exercises, create a new sub-directory named Chapter2. In the new sub-directory, create an RPerl program file named exercise_1-circumference_of_specific_radius.pl, which contains constant data for the approximate value of pi equal to 3.141_592_654. In your program, utilize the print operator and RPerl's data type conversion subroutines to display your hard-coded value of pi.

Next, create a variable named $radius with a value of 12.5, and a second variable named $circumference which contains the properly-calculated circumference for a circle with the corresponding radius value. Use print and the type conversion subroutines to display all your values:

    Pi = 3.141_592_654
    Radius = 12.5
    Circumference = 2 * Pi * Radius = 2 * 3.141_592_654 * 12.5 = 78.539_816_35

Run your new program by issuing the following command at your terminal command prompt:

    $ rperl -t LearningRPerl/Chapter2/exercise_1-circumference_of_specific_radius.pl

HINT: You will need two no critic commands, and your program should be at least 13 lines long.

2. Variable Radius & Calculated Circumference Of A Circle [ 20 mins ]

Make a copy of your fully-functioning program from exercise 1 above, thereby creating a new RPerl program file named exercise_2-circumference_of_any_radius.pl. (Don't just edit your previous program file without making a copy, or you will not receive full credit.)

Use print to prompt the user to enter a custom value for $radius, then use the STDIN keyword to receive one input value into a new variable named $radius_string of string data type. Next, use an RPerl data type conversion subroutine to convert the string data type and store it in the original $radius variable of number data type.

As in the previous exercise, end by using the print operator to display all your values:

    $ rperl -t LearningRPerl/Chapter2/exercise_2-circumference_of_any_radius.pl
    Please input radius: 3

    Pi = 3.141_592_654
    Radius = 3
    Circumference = 2 * Pi * Radius = 2 * 3.141_592_654 * 3 = 18.849_555_924

HINT: You will need to add a third no critic command, and a total of at least 3 or 4 extra lines of code.

3. Conditional Error Checking & Calculated Circumference Of A Circle [ 20 mins ]

Make a copy of your completed program from exercise 2 above, and name your new file exercise_3-circumference_of_any_positive_radius.pl.

Using an if and else conditional statement, detect when the user provides a negative number for the radius variable, in which case a warning should be displayed and the circumference should be set to a value of zero:

    $ rperl -t LearningRPerl/Chapter2/exercise_3-circumference_of_any_positive_radius.pl
    Please input radius: -3
    Negative radius detected, defaulting to zero circumference!

    Pi = 3.141_592_654
    Radius = -3
    Circumference = 2 * Pi * Radius = 2 * 3.141_592_654 * -3 = 0

4. Product Of Two Numbers [ 20 mins ]

Write a new RPerl program named exercise_4-product_of_any_two_numbers.pl, which will prompt the user for two input numbers, one at a time, and then display the result of multiplying the two input numbers together. As with all keyboard input in this and any other RPerl source code, your input values are text data and should be passed to the RPerl data type conversion subroutines before being used in numeric calculations. Remember to reverse this by converting back from number to string types again, before passing a numeric value to the print operator.

An example execution of this exercise should display:

    $ rperl -t LearningRPerl/Chapter2/exercise_4-product_of_any_two_numbers.pl
    Please input multiplicator: 17
    Please input multiplicand: 123.456_789

    Product = Multiplicator * Multiplicand = 17 * 123.456_789 = 2_098.765_413

5. String Repetition [ 15 mins ]

Create a new program named exercise_5-string_repeat.pl, which will prompt the user for one input string followed by one input integer. Your program should then use the string repetition operator x to display the input string repeated as many times as the input integer specifies. As always, use the RPerl data type conversion subroutines when appropriate.

An example execution should display:

    $ rperl -t LearningRPerl/Chapter2/exercise_5-string_repeat.pl
    Please input string to be repeated: Repetition is an aid to memory.
    Please input integer (whole number) times to repeat string: 5

    Repetition is an aid to memory.
    Repetition is an aid to memory.
    Repetition is an aid to memory.
    Repetition is an aid to memory.
    Repetition is an aid to memory.

6. Looping Integer Sum [ 35 mins ]

Create a program named exercise_6-sum_of_first_n_integers.pl, which will prompt the user for one input integer and then display the sum of the numbers 1 through the input integer, inclusive. Utilize an if conditional statement to perform an error check, and call the die operator if the input integer is less-than 0. Then utilize a while loop to compute the actual sum to be displayed as a result.

Store your input integer in a variable named $n, and your final result in a variable named $sum, both of the integer data type. As with all loop iterators, the $i variable is also an integer.

Two possible executions of this exercise follow:

    $ rperl -t LearningRPerl/Chapter2/exercise_6-sum_of_first_n_integers.pl
    Please input a positive integer: 42
    The sum of the first 42 integers is 903

    $ rperl -t LearningRPerl/Chapter2/exercise_6-sum_of_first_n_integers.pl
    Please input a positive integer: -42
    ERROR: -42 is not positive, dying

7. Non-Looping Integer Sum [ 45 mins ]

Make a copy of your completed program from the previous exercise, and save it into a new file named exercise_7-sum_of_first_n_integers_no_loop.pl.

Modify the new program to remove the while loop, and instead replace it with the famous method credited to Gauss, the "Prince of Mathematics":

It is said that as a child, Gauss was punished by his teacher by being told to mentally add together all the numbers from 1 to 100. Young Gauss realized that 1 plus 100 is 101, and 2 plus 99 is also 101, as well as 3 plus 98 and so forth. All the numbers may be paired in this manner to equal 101, and there are 50 such pairs from 1 to 100 inclusively, so the final answer is 101 multiplied by 50. Thus, with a bit more thought Gauss was able to quickly calculate the correct answer of 5,050 and perplex his teacher in the process.

Use young Gauss' algorithm to optimize the runtime performance (AKA speed) of your new program, which you can achieve by replacing the while loop with arithmetic operators as described in the story above. This exercise should always give the same answer as the previous exercise, but Gauss' algorithm is naturally more efficient for large input integers, because it requires far fewer calculations than the repeated addition loop algorithm.

Start by assuming your input integer will be evenly divisible by two, so you can use the exact algorithm used by Gauss in the story. Once your program works correctly for all even integer inputs, then use an if conditional statement to check for odd integer inputs and extend your algorithm to handle those cases. Of course, the error checking for negative input integers should be left in place from the previous exercise.

Two possible executions of this exercise follow, one with an even input integer, and one with an odd input integer:

    $ rperl -t LearningRPerl/Chapter2/exercise_7-sum_of_first_n_integers_no_loop.pl
    Please input a positive integer: 64
    The sum of the first 64 integers = 2_080

    $ rperl -t LearningRPerl/Chapter2/exercise_7-sum_of_first_n_integers_no_loop.pl
    Please input a positive integer: 65
    The sum of the first 65 integers = 2_145

HINT: Use the modulo % operator to check for odd input integers; if found, initialize a temporary variable named $n_odd to the value of $n, decrement $n to become even, and add $n_odd to your original $sum result.


CHAPTER 3: ARRAY VALUES & VARIABLES

In the previous chapter, we explored the many uses of scalar values and variables, which dealt with one number or one string at a time. In this chapter, we will similarly explore the uses of array values and variables, which are collections of multiple scalars (numbers or strings) grouped together.

A scalar represents exactly one piece of Perl data and represents one specific data type. An array represents zero or more pieces of Perl data and represents a compound "data structure", defined as a combination of two or more subcomponents which may themselves be either normal scalar data types or additional compound data structures. In an array, each individual subcomponent is known as an "element".

In normal Perl, a variable may contain (represent) an array directly "by value", or indirectly "by reference". As an analogy, imagine that your computer's memory is like a filing cabinet, your Perl program's memory is like one drawer in the filing cabinet, your array data structure is like one folder in the filing cabinet drawer, your variable is like the label on the folder, and each element of the array is like one page in the folder.

In this filing cabinet analogy, an array which is stored by value is like a normal label (array variable), on a normal folder (array data structure), containing normal pages (elements). An array which is stored by reference is like a special label (scalar variable), affixed to a single piece of paper (scalar data type), upon which is written the location (memory address AKA reference) of a special folder (anonymous array data structure). Inside the special folder are normal pages (elements), and there is no label (array variable) affixed to the special folder, thus we say the folder is an "anonymous array" because it has no label.

To reiterate, an array stored by reference is actually a scalar variable which contains the memory address of an anonymous array. When compared to the memory usage of storing an array by value, storing by reference only requires one additional scalar value (one piece of paper in our filing cabinet analogy), which contains the anonymous array's memory address. Both storing by value and storing by reference are commonly utilized in many different computer programming languages.

When an array is stored by value, the corresponding variable has an at-sign @ as its sigil, and the assigned data is enclosed within parentheses ( ) characters. When an array is stored by reference, the corresponding variable has a dollar-sign $ as its sigil, and the assigned data is enclosed within "square-bracket" [ ] characters, also known as just "brackets". If an array is stored by value and is then passed as input to an operation, it is said the array is "passed by value". On the other hand, if the array is instead stored by reference and then passed to an operation, it is said to be "passed by reference".

For arrays with more than just a few elements, it may be impractical or impossible to pass by value, because a full copy of each array element must be made in the process, which may fill up all your program's available memory or take a prohibitively long time to complete. Also, Perl allows us to provide explicit data types only when an array is stored by reference, so we cannot provide a data type for an array stored by value. For these reasons, all RPerl arrays are stored by reference, and are declared with an explicit RPerl data type ending with _arrayref.

    my                  @foo_by_value           = (2, 4, 6);  # fine in normal Perl, error in RPerl
    my                  $foo_by_reference       = [2, 4, 6];  # fine in normal Perl, error in RPerl
    my integer_arrayref $foo_by_reference_typed = [2, 4, 6];  # fine in normal Perl, fine  in RPerl
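The difference between copying by value and copying by reference can be sketched in a few lines. The example below is plain Perl 5 with the RPerl type names left out, so it runs without RPerl installed; the variable names @copy and $alias are illustrative only. Assigning one stored-by-value array to another copies every element, while assigning one array reference to another copies only the memory address.

```perl
use strict;
use warnings;

# plain Perl sketch; RPerl type names omitted so this runs without RPerl installed
my @by_value     = (2, 4, 6);
my @copy         = @by_value;      # copies every element into a second array
my $by_reference = [2, 4, 6];
my $alias        = $by_reference;  # copies only the memory address

$copy[0]    = 99;  # changes the copy only, @by_value is untouched
$alias->[0] = 99;  # changes the one shared anonymous array

print 'have $by_value[0]       = ', $by_value[0],       "\n";
print 'have $by_reference->[0] = ', $by_reference->[0], "\n";
```

The first print statement should display 2 and the second should display 99, because $alias and $by_reference both hold the address of the same anonymous array.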

In a few special cases, Perl forces us to provide an array by value instead of by reference, in which case we need to "dereference" our array variable, which is the process of converting from the stored-by-reference memory address to the stored-by-value elements. This is achieved by use of Perl's closefix array dereference syntax, which consists of enclosing the scalar array variable within at-sign-curly-braces @{ }. Because all arrays in RPerl are stored by reference, only necessary uses of the dereference syntax are supported by the RPerl compiler. (Please see "Section 3.8: push & pop Operators" for more information on pop.)

    my integer_arrayref $foo_by_reference_typed = [10, 20, 30];                    # fine in normal Perl, fine  in RPerl
    my integer          $foo_last_element       = pop @{$foo_by_reference_typed};  # fine in normal Perl, fine  in RPerl,   necessary dereference 
    my                  @foo_by_value           = @{$foo_by_reference_typed};      # fine in normal Perl, error in RPerl, unnecessary dereference

In normal Perl, a single array may contain elements with multiple different data types, such as a three-element array containing one string and one integer and one floating-point number. In RPerl, all elements of a single array must have the same data type, so you can have a three-element array with all strings, but you can't store an integer inside an array of strings, and likewise you can't store a string inside an array of integers, etc.

    my integer_arrayref $foo = [5, 10, 15];                 # fine
    my  number_arrayref $bar = [5, 10, 15.5];               # fine
    my  string_arrayref $bat = ['five', 'ten', 'fifteen'];  # fine

    my integer_arrayref $foo = [5, 10, 15.5];               # error in RPerl, compiled modes
    my  number_arrayref $bar = [5, 10, 'fifteen'];          # error in RPerl, compiled modes
    my  string_arrayref $bat = ['five', 'ten', 15];         # error in RPerl, compiled modes

You can create an array with only one element, sometimes called a "singleton"; you can even create an empty array with zero elements:

    my integer_arrayref $foo = [];    # zero elements, empty     array
    my integer_arrayref $bar = [23];  # one  element,  singleton array

To display the contents of an array, you may utilize the *_arrayref_to_string() family of stringification subroutines:

    my integer_arrayref $foo;
    $foo = [23, 42, 2_112];
    print '$foo = ', integer_arrayref_to_string($foo), "\n";

Running the code example above generates the following output:

    $foo = [23, 42, 2_112]

RPerl currently supports the following array stringification subroutines, with support for all RPerl array data types coming soon:

Section 3.1: Lists vs Arrays

In Perl, we use the comma , character to separate elements in a "list", which may then be utilized either as the operands passed as input to an appropriate operation, or as the values stored inside an array data structure. Thus, an array variable may be assigned the value of an enclosed list, but in RPerl the reverse is not true. (Please see "Section 3.7: Array Assignment" for more information.)

    my                 @variable_storing_array_by_value     = ('list', 'enclosed', 'within', 'round',  'parentheses');  # fine in Perl, error in RPerl, list assigned to stored-by-value     array
    my string_arrayref $variable_storing_array_by_reference = ['list', 'enclosed', 'within', 'square', 'brackets'];     # fine in Perl, fine  in RPerl, list assigned to stored-by-reference array

    my ($list, $of, $variables) = @variable_storing_array_by_value;      # fine  in Perl, error in RPerl, stored-by-value     array assigned to list
    my [$list, $of, $variables] = $variable_storing_array_by_reference;  # error in Perl, error in RPerl, stored-by-reference array assigned to list

As seen in the code example above, when a list is enclosed within square-brackets [ ], then we have a stored-by-reference "array literal", which represents the literal value which may be assigned to an RPerl variable. Likewise, when a list is enclosed within parentheses ( ) characters and the context tells Perl the parentheses are used as array syntax, then we have a stored-by-value array literal, which may be assigned to a normal Perl variable.

Depending upon the context of the surrounding source code, the parentheses ( ) characters have multiple different uses in Perl, including list and array literal values, explicit order-of-operations, operation input arguments, and control structure statement headers. (Remember that RPerl only supports arrays which are stored by reference, so all RPerl array literals will be enclosed in square-brackets [ ], not parentheses ( ).)

    my                 @foo = (1,  3,   5);  # fine in Perl, error in RPerl, parentheses enclosing list values to create stored-by-value array
    my integer         $foo = (1 + 3) * 5;   # fine in Perl, fine  in RPerl, parentheses enclosing arithmetic order-of-operations
    my string_arrayref $foo = [qw(a c e)];   # fine in Perl, fine  in RPerl, parentheses enclosing input arguments to qw() operator
    if (1 < 3)         { print   'a c e'; }  # fine in Perl, fine  in RPerl, parentheses enclosing if conditional statement header

Terminology can be confusing at times. An array literal should not be confused with the similar-sounding "array of literals", which is an array where all the elements are literals instead of variables or operations. In the source code example below, the data structure [$foo, $bar, 23] is an array literal, even though two of its three elements are variables themselves; remember, an array literal is any enclosed list which can be assigned to a variable. On the other hand, the data structure [22, 13, 24] below is an array of literals, because all its elements are literal values themselves. At the same time, the variable $frob can also be considered an array of literals, due to the ambiguity of the term "array" when used without explicit context.

    my integer          $foo  = 21;
    my integer          $bar  = 12;
    my integer_arrayref $quux = [$foo, $bar, 23];  # array literal     on right side  of = assignment operator
    my integer_arrayref $frob = [22,   13,   24];  # array of literals on both  sides of = assignment operator

Depending on context, the unqualified term "array" can be taken to have either the general meaning of "array data structure category", or the specific meanings "array literal" or "array assigned to variable". The phrase "scalar versus array" should be read as "scalar data type category versus array data structure category", because we are referring to the general ideas of "scalar" and "array" instead of specific array literals or variables. The phrase "array [22, 13, 24]" should be read as "array literal [22, 13, 24]", because we can see an enclosed list of comma-separated values. The phrase "array $frob" should be read as "array assigned to variable $frob", because we can see a dollar-sign $ sigil variable name. (This is just one of many possible confusion points caused by the linguistic subtleties inherent in our particular brand of technical terminology, often rightfully referred to as "programming jargon" or even the mocking "techno-babble".) Because the stand-alone term "array" is ambiguous, the term "array of literals" can be taken to mean either "array-literal of literals" or "array-assigned-to-variable of literals". This means we can consider both $frob and [22, 13, 24] to be an "array of literals" in the source code example above. Hopefully, you will never again experience this specific confusion of jargon, but you must always be wary of subtle language issues when dealing with high-tech concepts such as programming in RPerl.

Section 3.2: 1-D Array Data Types & Constants

In RPerl, all arrays are stored by reference and have a specific data type which ends with _arrayref. In the preceding examples, you have already seen the three most common RPerl array data types, which are integer_arrayref, number_arrayref, and string_arrayref.

A single scalar may be considered to be a zero-dimensional (0-D) data point. When an array's individual elements are each scalars, the array is considered to be one-dimensional (1-D), and may be visualized as multiple data points displayed on a single line. Some additional math and software concepts related to a 1-D array or list are "vector", "sequence", and "set"; please see Wikipedia for more information. (For example, in the MathPerl software suite, which is part of the RPerl Family of software, the Vector data structure is simply a specially-packaged number_arrayref.) In some cases, it may also be useful to visualize a 1-D array as a single "row" of elements.

    my integer_arrayref $row_1D = [0, 2, 4, 6, 8];  # one row on one line

The following 1-D array data types may be utilized in RPerl:

Perl itself does not currently support array data structures with constant values, so only scalars may be constants.

    use constant PIE  => my string $TYPED_PIE = 'pecan';  # fine in Perl & RPerl

    use constant DAYS => my string_arrayref $TYPED_DAYS = [ 'Sun', 'Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat' ];  # error in Perl & RPerl

As mentioned in "Section 2.4: Variables With Scalar Values", future versions of RPerl will likely provide the scalar data type and also the scalar_arrayref data structure, which will provide the ability to use one array reference to store elements with non-matching data types. Currently, RPerl arrays must store elements which all have the exact same data type as one another.

Please see "Section 3.5: 2-D Array Data Types & Nested Arrays" for more RPerl array data types.

Section 3.3: How To Access Array Elements

Please take a few moments to review "Section 2.9.1: Loop Iterator Variables" at this time.

Closely related to the loop concept of an iterator variable is that of an "array index". Each element of a Perl array is numbered, in order, by integers starting at the value of zero and counting upward. Each such integer is known as an array index, or just "index" for short.

(Depending upon context or tradition, you may argue the "i" in $i stands for either "iterator" or "index". The main difference between an iterator and an index is usage: an iterator is an integer variable used to count loop iterations, whereas an index is an integer value used to specify an array element, and may be either a variable or a hard-coded numeric literal. An iterator can always be used as an index, and often is; however, the reverse is not true.)

What a normal person would think of as being the "first" element in an array is not at index 1, but is at index 0 instead. This is a common point of confusion for new programmers, or even experienced programmers who may be accustomed to a different computer language which starts at index 1 instead of at index 0 as Perl does. It may help to think of the "first" element as actually being the "zeroth" element.

    my string_arrayref $marx_brothers = ['Chico', 'Harpo', 'Groucho', 'Gummo', 'Zeppo'];
    print 'The first born is ',   $marx_brothers->[0], "\n";
    print 'The middle child is ', $marx_brothers->[2], "\n";

    The first born is Chico
    The middle child is Groucho

In the example above, note the "first born" is located at index value of 0 (not 1), and the "middle child" is at index 2 (not 3).

Also, note the thin-arrow-square-brackets ->[ ] syntax for accessing the individual elements. The thin-arrow -> is Perl's "postfix dereference" operation, which fetches the array data pointed to by the array's memory address, and is necessary because all RPerl arrays are stored by reference. When combined with the postfix dereference operation, the square-brackets [ ] return the actual element located at the specified index value. In the following example, each of the three lines of source code achieves the same goal of returning the array element at index 2, although only the last line is valid in RPerl:

    $my_element = @my_array[2];        # fine in Perl, error in RPerl, array not stored by reference
    $my_element = @{$my_arrayref}[2];  # fine in Perl, error in RPerl, unnecessary use of @{} closefix dereference syntax
    $my_element = $my_arrayref->[2];   # fine in Perl, fine  in RPerl,   necessary use of ->   postfix dereference syntax
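To illustrate how a loop iterator doubles as an array index, here is a minimal sketch using the while loop from Chapter 2. It is written in plain Perl 5 with the RPerl type names omitted, so it can be run without RPerl; the $output variable is an illustrative name, not part of RPerl.

```perl
use strict;
use warnings;

# plain Perl sketch; in RPerl this variable would be declared as a string_arrayref
my $marx_brothers = ['Chico', 'Harpo', 'Groucho', 'Gummo', 'Zeppo'];
my $output        = q{};

# $i is the loop iterator, and is also used as the array index
my $i = 0;
while ($i < 5) {
    $output .= $i . ' => ' . $marx_brothers->[$i] . "\n";
    $i++;
}
print $output;
```

Each line of output pairs an index with the element stored there, starting with "0 => Chico" and ending with "4 => Zeppo".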

Section 3.4: Array Length & Negative Indices

When we want to count how many elements are in an array, we need to find the array's length. This is achieved by use of Perl's scalar operator, combined with Perl's closed-fixity at-sign-curly-braces @{ } dereference operation. Both the closefix dereference and postfix dereference operations perform the same task, although they are used in differing scenarios due to their unique syntax. The scalar operator forces the dereferenced array to be evaluated in "scalar context", which means Perl tries to treat an array as if it were a scalar. Perl has many complex behaviors when forcing one data type's context upon another different data type, although in the case of forcing an array into scalar context we are simply provided with the length of the array.

    my string_arrayref $greetings        = ['hello', 'hi', 'howdy'];
    my integer         $greetings_length = scalar @{$greetings};
    print 'have $greetings_length = ', $greetings_length, "\n";

    have $greetings_length = 3

So, when you want to utilize an array's length to calculate a desired index value, remember that you must subtract one from the length value, because array indices start counting at zero but the array length starts counting at one. Yes, this is another common trap for new programmers, so watch yourself!

    my string_arrayref $greetings        = ['hello', 'hi', 'howdy'];
    my integer         $greetings_length = scalar @{$greetings};
    my string $greeting_final            = $greetings->[($greetings_length - 1)];
    print 'have $greeting_final = ', $greeting_final, "\n";

    have $greeting_final = howdy

Now let's see what would happen if we simply forgot to subtract one:

    my string_arrayref $greetings        = ['hello', 'hi', 'howdy'];
    my integer         $greetings_length = scalar @{$greetings};
    my string $greeting_final            = $greetings->[$greetings_length];
    print 'have $greeting_final = ', $greeting_final, "\n";

    Use of uninitialized value $greeting_final in print...
    have $greeting_final =

As you can see in the output above, an attempt to access the $greetings element at index value 3 results in an uninitialized value warning message, because the highest valid index value is actually 2 in this case. Don't forget to subtract one when converting from array lengths to array indices!

In Perl, an array index with a negative value will access elements beginning from index -1 as the final element of the array, and counting backward to end with the starting (zeroth) element. Thus, we may rewrite the previous example to achieve the exact same result by utilizing a negative index instead of the scalar operation, thereby simplifying our code by removing at least three operations and one variable:

    my string_arrayref $greetings        = ['hello', 'hi', 'howdy'];
    my string $greeting_final            = $greetings->[-1];
    print 'have $greeting_final = ', $greeting_final, "\n";

    have $greeting_final = howdy

If you want to access each array element in reverse order, simply count downward with indices starting at -1:

    my string_arrayref $greetings        = ['hello', 'hi', 'howdy'];
    print 'have $greetings->[-1] = ', $greetings->[-1], "\n";
    print 'have $greetings->[-2] = ', $greetings->[-2], "\n";
    print 'have $greetings->[-3] = ', $greetings->[-3], "\n";

    have $greetings->[-1] = howdy
    have $greetings->[-2] = hi
    have $greetings->[-3] = hello
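The relationship between negative and non-negative indices can be checked directly: for an array of a given length, index -k should select the same element as index (length - k). The sketch below is plain Perl 5 with RPerl type names omitted; the names $k, $from_tail, $from_head, and $mismatches are illustrative.

```perl
use strict;
use warnings;

# plain Perl sketch; in RPerl $greetings would be declared as a string_arrayref
my $greetings        = ['hello', 'hi', 'howdy'];
my $greetings_length = scalar @{$greetings};
my $mismatches       = 0;

# compare index -$k against index (length - $k), for every valid $k
my $k = 1;
while ($k <= $greetings_length) {
    my $from_tail = $greetings->[-$k];
    my $from_head = $greetings->[$greetings_length - $k];
    if ($from_tail ne $from_head) { $mismatches++; }
    print 'index -', $k, ' and index ', ($greetings_length - $k), ' both hold ', $from_tail, "\n";
    $k++;
}
print 'have $mismatches = ', $mismatches, "\n";
```

The final line should report zero mismatches, confirming that index -1 pairs with index 2, index -2 with index 1, and index -3 with index 0 for this three-element array.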

Section 3.5: 2-D Array Data Types & Nested Arrays

Because one data structure may contain another data structure as one of its subcomponents, we may thus nest multiple array values within one another. In normal Perl, any number of arrays may be arbitrarily nested within one another in any syntactically-valid combination; in RPerl, we require the use of explicit data types, which are currently limited to 2-dimensional array-within-array data structures. Support for 3-dimensional or other nested data structures will be added in a future version of RPerl.

    my integer_arrayref_arrayref $nested_2d =      # fine in normal Perl, fine  in RPerl, 2x3     2-D nested array
        [[0, 1, 2],
         [9, 8, 7]];
    my                           $nested_2d_mix =  # fine in normal Perl, error in RPerl, 2x3     2-D nested array, unmatching integer & string types
        [[0, 1, 2],
         ['a', 'b', 'c']];
    my                           $nested_3d =      # fine in normal Perl, error in RPerl, 3x3x3   3-D nested array, RPerl support coming soon
        [[[1, 2, 3], 
          [2, 3, 4], 
          [3, 4, 5]],
         [[9, 8, 7],
          [8, 7, 6],
          [7, 6, 5]],
         [[0, 2, 4],
          [2, 4, 6],
          [4, 6, 8]]];
    my                           $nested_4d =      # fine in normal Perl, error in RPerl, 1x1x1x3 4-D nested array, RPerl support coming soon
        [[[[23, 42, 2_112]]]];

When an array's individual elements are each arrays, and each of those arrays is comprised of scalars, then the primary array is considered to be 2-D. ("Or not 2-D? That is the question." ... Okay sorry for that one, haha!) A 2-D array may be visualized as multiple data points displayed on multiple lines. In mathematics, the concept of a "matrix" is essentially a 2-D array which usually has an equal number of elements in each row. Because of this, a matrix can often be visualized as either rectangular or square, and is said to contain both rows and "columns", where the number of columns is defined as the number of elements in each row. In the example below, the 2-D array can be utilized as a square matrix, because there are an equal number of rows (five) as there are columns (also five).

    my integer_arrayref_arrayref $rows_and_columns_2D =  # fine in RPerl, multiple rows and columns on multiple lines
        [[0, 2, 4, 6, 8],
         [1, 3, 5, 7, 9],
         [4, 3, 2, 1, 0],
         [9, 8, 7, 6, 5],
         [5, 5, 5, 5, 5]];

However, in RPerl we are not limited to only rectangular or square matrices, because we are not required to have the same number of elements in each row of a 2-D array. In the example below, note the acceptable use of rows with only one or zero elements; the empty array [ ] and the singleton array [5] are both valid in RPerl.

    my integer_arrayref_arrayref $rows_and_columns_2D_irregular =  # fine in RPerl, irregular row lengths
        [[0, 2, 4],
         [1, 3, 5, 7, 9],
         [],
         [9, 8, 7, 6],
         [5]];
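Because rows may have different lengths, it can be useful to measure each row separately. The following sketch, written in plain Perl 5 with RPerl type names omitted, applies the scalar operator to each row of the irregular array in turn; the names $row_count and $row_lengths are illustrative.

```perl
use strict;
use warnings;

# plain Perl sketch; in RPerl this would be declared as an integer_arrayref_arrayref
my $rows_and_columns_2D_irregular =
    [[0, 2, 4],
     [1, 3, 5, 7, 9],
     [],
     [9, 8, 7, 6],
     [5]];

my $row_count   = scalar @{$rows_and_columns_2D_irregular};
my $row_lengths = [];

my $i = 0;
while ($i < $row_count) {
    # the postfix dereference selects one row; scalar @{ } then counts its elements
    $row_lengths->[$i] = scalar @{$rows_and_columns_2D_irregular->[$i]};
    print 'row ', $i, ' has length ', $row_lengths->[$i], "\n";
    $i++;
}
```

The reported lengths should be 3, 5, 0, 4, and 1, matching the rows as written.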

The following 2-D array data types may be utilized in RPerl:

To access elements in a 2-D array, you must use two postfix dereference operations, which may either be combined into one statement or separated across multiple statements. When a 2-D array is being utilized as a matrix in RPerl, then the data structure is stored in "row-major form", which means the first dereference operation will give you a whole row, and the second dereference will give you an individual column element from within the selected row.

    my integer $row_3_column_0_combined = $rows_and_columns_2D->[3]->[0];  # row and column dereferences, combined in one statement
    print 'have $row_3_column_0_combined = ', $row_3_column_0_combined, "\n";

    my integer_arrayref $row_3 = $rows_and_columns_2D->[3];                # row dereference only
    print 'have $row_3 = ', integer_arrayref_to_string($row_3), "\n";
    my integer $row_3_column_0_separated = $row_3->[0];                    # column dereference only
    print 'have $row_3_column_0_separated = ', $row_3_column_0_separated, "\n";

If you run the code example above with the value of $rows_and_columns_2D given in this section, then you should receive the following output:

    have $row_3_column_0_combined = 9
    have $row_3 = [9, 8, 7, 6, 5]
    have $row_3_column_0_separated = 9

If you want to select an entire matrix column instead of a row, then you can change the first dereference operation's index to select each row one-at-a-time, while keeping the second dereference's index set to the desired column:

    my integer_arrayref $column_3 = [];
    $column_3->[0] = $rows_and_columns_2D->[0]->[3];
    $column_3->[1] = $rows_and_columns_2D->[1]->[3];
    $column_3->[2] = $rows_and_columns_2D->[2]->[3];
    $column_3->[3] = $rows_and_columns_2D->[3]->[3];
    $column_3->[4] = $rows_and_columns_2D->[4]->[3];
    # ... and so on, for each row in $rows_and_columns_2D (if there are more than 5 rows as shown above)
    print 'have $column_3 = ', integer_arrayref_to_string($column_3), "\n";

If you run the code example above, then you should receive the following output:

    have $column_3 = [6, 7, 1, 6, 5]
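The repeated statements used to build a column can also be written as a loop over the row index. This is only a sketch in plain Perl 5, with RPerl type names omitted and the normal Perl join operator used for display in place of integer_arrayref_to_string(); the names $row_count and $row_index are illustrative.

```perl
use strict;
use warnings;

# plain Perl sketch; in RPerl this would be declared as an integer_arrayref_arrayref
my $rows_and_columns_2D =
    [[0, 2, 4, 6, 8],
     [1, 3, 5, 7, 9],
     [4, 3, 2, 1, 0],
     [9, 8, 7, 6, 5],
     [5, 5, 5, 5, 5]];

my $column_3  = [];
my $row_count = scalar @{$rows_and_columns_2D};

# vary the row index, keep the column index fixed at 3
my $row_index = 0;
while ($row_index < $row_count) {
    $column_3->[$row_index] = $rows_and_columns_2D->[$row_index]->[3];
    $row_index++;
}
print 'have $column_3 = [', join(', ', @{$column_3}), "]\n";
```

This approach works for any number of rows, provided every row has an element at index 3.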

It is important not to confuse a 1-D array with a 2-D array which only contains one row. A 1-D array will only require one dereference in order to access individual elements, while a 2-D array will require two dereferences to achieve the same thing. For a 2-D array containing only one row, the first dereference should always be at index 0 to retrieve the whole row.

    my integer_arrayref          $foo_1D =  [3, 6, 9];
    my integer_arrayref_arrayref $foo_2D = [[3, 6, 9]];

    print 'have 1-D single dereference = ', $foo_1D->[2],      "\n";
    print 'have 2-D double dereference = ', $foo_2D->[0]->[2], "\n";

Both dereference statements in the example above should access the numeric value of 9, taken from $foo_1D and then $foo_2D, respectively:

    have 1-D single dereference = 9
    have 2-D double dereference = 9

Section 3.6: Quote Word qw() Operator

Sometimes you will want to create an array which contains multiple string literals, which can be achieved either by normal use of quotes and commas, or by use of the "quote word" operator qw().

    Name        Symbol  Arity     Fixity  Precedence  Associativity  Supported
    Quote Word  qw( )   Variadic  Closed  01          Left           Coming Soon

Let's return to a source code example from "Section 3.1: Lists vs Arrays", which shows a common use of the qw() operator, compared below with the equivalent code using quotes and commas:

    my string_arrayref $foo_1 = [qw(a c e)];      # 4 characters fewer
    my string_arrayref $bar_1 = ['a', 'c', 'e'];

In the example above, we can reduce our source code by 4 characters by utilizing the qw() operator. The benefit of using qw() only increases with the number of array elements.

    my string_arrayref $foo_2 = [qw(a b c d e f g)];  # 16 characters fewer
    my string_arrayref $bar_2 = ['a', 'b', 'c', 'd', 'e', 'f', 'g'];

    my string_arrayref $foo_3 = [qw(a b c d e f g h i j k l m)];  # 34 characters fewer
    my string_arrayref $bar_3 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm'];

Now let's take a look at the data we have stored in our array reference variables:

    print 'have $foo_1 = ', string_arrayref_to_string($foo_1), "\n";
    print 'have $bar_1 = ', string_arrayref_to_string($bar_1), "\n";
    print 'have $foo_2 = ', string_arrayref_to_string($foo_2), "\n";
    print 'have $bar_2 = ', string_arrayref_to_string($bar_2), "\n";
    print 'have $foo_3 = ', string_arrayref_to_string($foo_3), "\n";
    print 'have $bar_3 = ', string_arrayref_to_string($bar_3), "\n";

    have $foo_1 = ['a', 'c', 'e']
    have $bar_1 = ['a', 'c', 'e']
    have $foo_2 = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    have $bar_2 = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    have $foo_3 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm']
    have $bar_3 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm']

As we can see in the output above, the data stored by the qw() operator is identical to the data stored by the normal quotes and commas technique, and qw() requires fewer characters of source code when there are at least 2 string array elements.
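The equivalence can also be checked element by element rather than by reading printed output. The sketch below is plain Perl 5 with RPerl type names omitted; the names $same and $spaced are illustrative. It also shows one caveat of normal Perl's qw(): the operator splits its input on whitespace, so a string containing a space cannot be written as a single qw() element.

```perl
use strict;
use warnings;

# plain Perl sketch; in RPerl these would be declared as string_arrayref
my $foo = [qw(a c e)];
my $bar = ['a', 'c', 'e'];

# compare lengths first, then compare each pair of elements
my $same = 1;
if ((scalar @{$foo}) != (scalar @{$bar})) { $same = 0; }
my $i = 0;
while ($same && ($i < (scalar @{$foo}))) {
    if ($foo->[$i] ne $bar->[$i]) { $same = 0; }
    $i++;
}
print 'have $same = ', $same, "\n";

# qw() splits on whitespace, so this array has two elements, not one
my $spaced = [qw(New York)];
print 'have $spaced length = ', (scalar @{$spaced}), "\n";
```

The first line of output should show $same equal to 1, and the second should show a length of 2; use normal quotes and commas when an element must contain whitespace.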

Section 3.7: Array Assignment

As mentioned in "Section 3.1: Lists vs Arrays", normal Perl allows you to perform a special kind of array assignment with multiple variables on the left (receiving) side of the equal-sign = assignment operator. This is one of Perl's countless shortcuts, allowing a software developer to initialize or modify more than one variable in a single statement.

    my ($x, $y, $z)       = @array_by_value;  # fine in Perl, error in RPerl, stored-by-value array assigned to list of scalar variables
    my ($foo, $bar, $bat) = (10, 20, 30);     # fine in Perl, error in RPerl, list                  assigned to list of scalar variables

In RPerl, we support exactly one variable on the left side of each assignment operator. If you need to initialize or modify multiple variables, simply separate each into its own statement. The source code below achieves the same goals as the code above, without utilizing special array assignment statements. Each line of source code above has been expanded into three equivalent lines below.

    my integer $x = $array_by_reference->[0];  # fine in Perl, fine in RPerl, stored-by-reference array element assigned to scalar variable
    my integer $y = $array_by_reference->[1];
    my integer $z = $array_by_reference->[2];

    my integer $foo = 10;                      # fine in Perl, fine in RPerl, scalar assigned to scalar variable
    my integer $bar = 20;
    my integer $bat = 30;
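A common use of the multi-variable shortcut in normal Perl is swapping two values in one statement. With only one variable allowed on the left side of each assignment, the same swap can be written with a temporary variable. The sketch below is plain Perl 5 with RPerl type names omitted, and $swap_tmp is an illustrative name.

```perl
use strict;
use warnings;

# plain Perl sketch; in RPerl each variable would be declared as an integer
my $foo = 10;
my $bar = 20;

# normal Perl shortcut with two variables on the left side:
#     ($foo, $bar) = ($bar, $foo);
# equivalent form with exactly one variable on the left of each assignment
my $swap_tmp = $foo;
$foo = $bar;
$bar = $swap_tmp;

print 'have $foo = ', $foo, "\n";
print 'have $bar = ', $bar, "\n";
```

After the three assignments, $foo should hold 20 and $bar should hold 10.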

Section 3.8: push & pop Operators

Often you will want to append one or more elements to the end of an existing array, and equally often you will want to remove one or more elements from the end of an array. For these use cases, you may choose the push and pop operators. (The end of an array is sometimes called the "tail" of the array, and likewise the beginning of an array is sometimes called its "head".)

The push operator will add one or more new elements to the end of a dereferenced array, and will return the new array length. The pop operator will remove one element from the end of a dereferenced array, and will return the removed element. You may optionally ignore the return value of either operator.

Both the push and pop operators require array arguments to be dereferenced, which means you must always use the closed-fixity at-sign-curly-braces @{ } array dereference operator for arrays, as demonstrated in the code examples below.

    Name  Symbol  Arity     Fixity  Precedence  Associativity  Supported
    Push  push    Variadic  Prefix  01          Left           Coming Soon
    Pop   pop     Unary     Prefix  01          Left           Coming Soon
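A minimal sketch of push and pop follows, written in plain Perl 5 with RPerl type names omitted so it runs without RPerl installed. The variable names $stack, $new_length, and $popped are illustrative. Note the @{ } dereference around the array reference in both statements.

```perl
use strict;
use warnings;

# plain Perl sketch; in RPerl $stack would be declared as an integer_arrayref
my $stack = [1, 2, 3];

# push appends one or more elements to the tail, and returns the new length
my $new_length = push @{$stack}, 4, 5;
print 'have $new_length = ', $new_length, "\n";

# pop removes one element from the tail, and returns the removed element
my $popped = pop @{$stack};
print 'have $popped = ', $popped, "\n";

print 'have $stack = [', join(', ', @{$stack}), "]\n";
```

The new length should be 5, the popped element should be 5, and the array should be left holding 1 through 4.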

Section 3.9: shift & unshift Operators

In the previous section, we saw how to append and remove elements from the end of an array, using push and pop. In this section, we will work with the beginning of an array instead of the end, using the shift and unshift operators.

The shift operator will remove one element from the beginning of a dereferenced array, and will return the removed element, in the same way the pop operator works on the end of an array. The unshift operator will add one or more elements to the beginning of a dereferenced array, and like push will return the new array length.

Like the push and pop operators, both shift and unshift require array arguments to be dereferenced, so don't forget to use the at-sign-curly-braces @{ } array dereference operator every time.

    Name     Symbol   Arity     Fixity  Precedence  Associativity  Supported
    Shift    shift    Unary     Prefix  01          Left           Coming Soon
    Unshift  unshift  Variadic  Prefix  01          Left           Coming Soon
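The behavior of shift and unshift can be sketched in the same way as push and pop. The example is plain Perl 5 with RPerl type names omitted; the names $queue, $new_length, and $shifted are illustrative. When unshift is given several elements at once, they are added to the head as a group, so their order is preserved.

```perl
use strict;
use warnings;

# plain Perl sketch; in RPerl $queue would be declared as an integer_arrayref
my $queue = [3, 4, 5];

# unshift adds one or more elements to the head, and returns the new length
my $new_length = unshift @{$queue}, 1, 2;
print 'have $new_length = ', $new_length, "\n";

# shift removes one element from the head, and returns the removed element
my $shifted = shift @{$queue};
print 'have $shifted = ', $shifted, "\n";

print 'have $queue = [', join(', ', @{$queue}), "]\n";
```

The new length should be 5, the shifted element should be 1, and the array should be left holding 2 through 5.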

Section 3.10: Range .. Operator

Often you will want to generate a list of integers increasing by one. For this purpose, the natural choice is the range operator .., sometimes called "dot-dot".

    Name   Symbol  Arity   Fixity  Precedence  Associativity  Supported
    Range  ..      Binary  Infix   17          Non            Coming Soon

Remember, if you pass value 0 as the first argument for the range operator .., whether for use as an array index or loop iterator variable or any other purpose, then you will generate a list whose length is 1 greater than the operator's second argument. So, a list from 1 to 5 has length 5, and a list from 0 to 5 has length 6. As an extension of the array-indices-start-at-zero issue, this can be a common mistake or mixup for new programmers.

    my integer_arrayref $foo = [1 .. 5];
    print 'have $foo = ', integer_arrayref_to_string($foo), "\n";
    print 'have $foo length = ', (scalar @{$foo}), "\n";

    have $foo = [1, 2, 3, 4, 5]
    have $foo length = 5

    my integer_arrayref $foo = [0 .. 5];
    print 'have $foo = ', integer_arrayref_to_string($foo), "\n";
    print 'have $foo length = ', (scalar @{$foo}), "\n";

    have $foo = [0, 1, 2, 3, 4, 5]
    have $foo length = 6

    my integer_arrayref $foo = [0 .. 5];
    push @{$foo}, 21, 12, 23;
    print 'have $foo = ', integer_arrayref_to_string($foo), "\n";
    print 'have $foo length = ', (scalar @{$foo}), "\n";

    have $foo = [0, 1, 2, 3, 4, 5, 21, 12, 23]
    have $foo length = 9

    my integer_arrayref $foo = [0 .. 5];
    push @{$foo}, 20 .. 25;
    print 'have $foo = ', integer_arrayref_to_string($foo), "\n";
    print 'have $foo length = ', (scalar @{$foo}), "\n";

    have $foo = [0, 1, 2, 3, 4, 5, 20, 21, 22, 23, 24, 25]
    have $foo length = 12
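The length of a range can be predicted with simple arithmetic: a range from first to last contains (last - first + 1) integers. The sketch below, in plain Perl 5 with RPerl type names omitted, checks that formula; the names $first, $last, $expected_length, and $empty_range are illustrative. It also shows that in normal Perl a range whose first argument is greater than its second produces an empty list.

```perl
use strict;
use warnings;

# plain Perl sketch; in RPerl $range would be declared as an integer_arrayref
my $first = 0;
my $last  = 5;

my $range           = [$first .. $last];
my $range_length    = scalar @{$range};
my $expected_length = $last - $first + 1;

print 'have $range_length    = ', $range_length,    "\n";
print 'have $expected_length = ', $expected_length, "\n";

# the range operator only counts upward, so this list is empty
my $empty_range = [5 .. 1];
print 'have $empty_range length = ', (scalar @{$empty_range}), "\n";
```

Both lengths should be 6 for the range from 0 to 5, and the reversed range should have length 0.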

Section 3.11: Converting From Array To String

The terms "stringify", "stringification", and "pretty print" refer to type conversion from any non-string data type or data structure into a string data type.

As mentioned in this chapter's opening section "CHAPTER 3: ARRAY VALUES & VARIABLES", RPerl currently provides the following array stringification subroutines, with support for all RPerl array data types coming soon:

The internals of the integer_arrayref_to_string() subroutine are implemented by the copied-and-pasted section of RPerl source code below, for Perl operations & Perl data types mode:

    # change this to hold your own data
    my integer_arrayref $input_avref = [8, 23, 17];

    # [ BEGIN COPIED-AND-PASTED CODE ]

    # declare local variables, av & sv mean "array value" & "scalar value" as used in Perl core
    my integer $input_av_length;
    my integer $input_av_element;
    my string $output_sv;
    my boolean $i_is_0 = 1;

    # compute length of (number of elements in) input array
    $input_av_length = scalar @{$input_avref};

    # begin output string with left-square-bracket, as required for all RPerl arrays
    $output_sv = '[';

    # loop through all valid values of $i for use as index to input array
    for my integer $i ( 0 .. ( $input_av_length - 1 ) ) {
        # retrieve input array's element at index $i
        $input_av_element = $input_avref->[$i];

        # append comma & space to output string for all elements except index 0
        if ($i_is_0) { $i_is_0 = 0; }
        else         { $output_sv .= ', '; }

        # stringify individual integer element, append to output string
        $output_sv .= integer_to_string($input_av_element);
    }

    # end output string with right-square-bracket, as required for all RPerl arrays
    $output_sv .= ']';

    # [ END COPIED-AND-PASTED CODE ]

    # display stringified output
    print $output_sv;

In the RPerl system source code above, you will see at least 1 new concept: the usage of a for loop.

The input to this source code is an integer_arrayref accessed via the variable $input_avref, and the output generated is a string in the variable $output_sv. (The term "avref" refers to "Array Value Reference", and likewise "sv" refers to "Scalar Value"; these terms are used in the internals of the Perl 5 interpreter itself, and copied in RPerl's internals for consistency.)

The for loop is used to access each individual element of the input array, one at a time; each element is then, in turn, stringified and appended to the output string. You will note the integer_to_string() stringification subroutine is called from within the integer_arrayref_to_string() subroutine, which makes sense because an integer_arrayref data structure is obviously composed of individual integer data types.

Running the code example above generates the following output string from $output_sv, which is an exact replica of the original RPerl integer_arrayref input data from $input_avref:

    [8, 23, 17]

(Please see "Section 4.5.1: my Intermittent Variables" for the full subroutine integer_arrayref_to_string().)
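The same loop structure carries over to other element types. The sketch below stringifies an array of strings, wrapping each element in single quotes. It is written in normal Perl with the RPerl type annotations omitted, so it can run under a stock perl interpreter. It is only an illustration under those assumptions, not RPerl's actual string_arrayref_to_string() implementation; the input data is hypothetical, and quote characters inside an element are not escaped.

```perl
use strict;
use warnings;

# hypothetical input data; change this to hold your own strings
my $input_avref = ['alpha', 'beta', 'gamma'];

# compute length of (number of elements in) input array
my $input_av_length = scalar @{$input_avref};

# begin output string with left-square-bracket
my $output_sv = '[';
my $i_is_0    = 1;

for my $i ( 0 .. ( $input_av_length - 1 ) ) {
    # append comma & space to output string for all elements except index 0
    if ($i_is_0) { $i_is_0 = 0; }
    else         { $output_sv .= ', '; }

    # wrap each string element in single quotes (no escaping attempted)
    $output_sv .= q{'} . $input_avref->[$i] . q{'};
}

# end output string with right-square-bracket
$output_sv .= ']';

print $output_sv, "\n";
```

With the input shown, this should print ['alpha', 'beta', 'gamma'].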

Section 3.12: Program Control Using The for & foreach Loops

In the previous section, you saw a for loop control structure used to stringify an array in the integer_arrayref_to_string() subroutine. In this section, we'll learn more about the for loop and its closely-related counterpart, the foreach loop.

Please take a few minutes to review "Section 2.9: Program Control Using The while Loop" if you have not yet done so; this should prove helpful because the for and foreach loops share some things in common with the while loop control structure. When compared with the much more loosely-structured while loop, the RPerl compiler may have more opportunities to perform automatic optimizations on for and foreach loop control structures. Unlike loop iterator variables in a while loop, iterators in a for or foreach loop should only be modified in the loop header, in order to avoid hard-to-diagnose behavior and also to increase the opportunities for automatic optimization.

BEST PRACTICES

  • When possible, always use a for or foreach loop instead of a while loop.
  • In a for or foreach loop, never modify the loop iterator variable inside the loop body, only inside the loop header.

In normal Perl, there is no difference whatsoever between the for and foreach loops; the 2 keywords are synonymous and you may freely interchange them as you like. This follows the Perl philosophy of TIMTOWTDI (There Is More Than One Way To Do It).

In RPerl, the for and foreach loops are used for separate, but related, purposes. This follows the RPerl philosophies of TDNNTBMTOWTDI (There Does Not Need To Be More Than One Way To Do It) and TIOFWTDI (There Is One Fastest Way To Do It).

In RPerl, the for loop itself is actually 2 different control structures: the "range" for loop and the "C-style" for loop. The range for loop is so named due to its use of the range operator .., which you may recall from "Section 3.10: Range .. Operator". The C-style for loop derives its name from its origins in the foundational programming language "C", which is the language used to build much modern technology such as Linux and Perl and RPerl. (Of course we prefer programming in Perl over C, which is why we created RPerl in the first place: RPerl generates the C and C++ code for us!)

Section 3.12.1: The Range for Loop

Often you will want your loop iterator variable to simply increase by one each time the loop iterates, starting and ending at some specific values. For this purpose, you may choose the range for loop control structure.

    for my integer $i ( 1 .. 5 ) {
        print 'have $i = ', $i, "\n";
    }
    have $i = 1
    have $i = 2
    have $i = 3
    have $i = 4
    have $i = 5

Range for loop control structures always take the following form:

    for my integer $i ( $lower .. $upper ) {
        # PERFORM OPERATIONS INSIDE LOOP
    }

In the source code example above, $i is the loop iterator variable, which you will hopefully recall from "Section 2.9.1: Loop Iterator Variables". By definition, a loop iterator variable must always be a variable, although it may be named something other than $i if you like. The $lower and $upper variables are the two operands to the infix range operator .., and they may be any expression, literal, or variable which produces a numeric value. The number of times this loop will iterate is ($upper - $lower) + 1, and each time the loop iterates it will effectively cause the value of $i to increase by one, starting at $lower and ending at $upper.
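The iteration count formula can be checked with a small counting loop. The sketch below is written in normal Perl with the RPerl type annotations omitted; the bounds 3 and 7 and the variable name $iteration_count are chosen only for illustration.

```perl
use strict;
use warnings;

# hypothetical bounds, chosen only for illustration
my $lower = 3;
my $upper = 7;

# count how many times the loop body actually runs
my $iteration_count = 0;
for my $i ( $lower .. $upper ) {
    $iteration_count++;
}

print 'have $iteration_count = ', $iteration_count, "\n";
print 'have expected count   = ', ( $upper - $lower ) + 1, "\n";
```

Both lines should report 5. Note that in normal Perl the formula only holds when $upper is greater than or equal to $lower; if $upper is less than $lower, the range is empty and the loop body does not run at all.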

In all range for loops, the iterator variable ($i in this example) is "lexically scoped" by virtue of Perl's my keyword, which means this specific instance of $i is valid within this specific loop header and body only. Any other independent loop(s) within the same Perl program may also utilize an iterator variable with the exact same name $i, but it will be a totally new instance of $i, and thus will not be connected in any way other than the coincidence of sharing an iterator variable name. Also, once you utilize a variable name as a loop iterator, then you must not declare or utilize that same variable name except within other independent loops (as just described), or within a totally different scope such as a different source code file, etc. Failure to do so will result in a C++ compile-time error, as illustrated in the next two source code examples.

The following example shows multiple instances of the loop iterator variable $i, without any errors:

    for my integer $i ( 0 .. 10 ) {
        # do stuff inside first loop, using $i; fine
    }

    # do stuff outside loops, not using $i; fine
    my integer $x = 100;

    # fine
    for my integer $i ( 20 .. 30 ) {
        # do stuff inside second loop,
        # using totally independent instance of $i
    }

The following example shows multiple instances of the loop iterator variable $i, with an error:

    for my integer $i ( 0 .. 10 ) {
        # do stuff inside first loop, using $i; fine
    }

    # do stuff outside loops, using $i; causes error in second loop below
    my integer $i = 100;

    # ERROR ECOGEASCP012, CODE GENERATOR, ABSTRACT SYNTAX TO C++: 
    # variable i already declared in this scope, namespace main::, subroutine/method main(), dying
    for my integer $i ( 20 .. 30 ) {
        # do stuff inside second loop,
        # using totally independent instance of $i
    }

We have recently seen the range for loop in "Section 3.11: Converting From Array To String", used inside the integer_arrayref_to_string() subroutine. Here is the header and first operation of that same range for loop, isolated from the rest of the code:

    # loop through all valid values of $i for use as index to input array
    for my integer $i ( 0 .. ( $input_av_length - 1 ) ) {
        # retrieve input array's element at index $i
        $input_av_element = $input_avref->[$i];

        # PERFORM OPERATIONS USING $input_av_element
    }

In the source code example above, the range operator .. begins at 0 and ends at ( $input_av_length - 1 ); remember, array indices start at 0, so we must always be careful to subtract one from the array's length when calculating the range operator's second operand. (If you forget to subtract one from the second operand, then your code will probably attempt to access one element beyond the end of the array, which generally causes errors, crashes, or unexpected behavior.) Inside the loop body, we access the element of array $input_avref located at index $i, and store the element inside the $input_av_element variable for further use in the loop (not shown). This loop will access each and every element of the array, in order, one at a time.
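As a complete, runnable illustration of the same indexing pattern, the sketch below adds up every element of an array. It is written in normal Perl with the RPerl type annotations omitted; the input data and the $sum variable are assumptions made for this example only.

```perl
use strict;
use warnings;

# hypothetical input data
my $input_avref     = [8, 23, 17];
my $input_av_length = scalar @{$input_avref};
my $sum             = 0;

# last valid index is ( length - 1 ), because array indices start at 0
for my $i ( 0 .. ( $input_av_length - 1 ) ) {
    # retrieve input array's element at index $i, add to running total
    $sum += $input_avref->[$i];
}

print 'have $sum = ', $sum, "\n";
```

With the input shown, this should print have $sum = 48.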

Section 3.12.2: The C-Style for Loop

Sometimes you need more control over your loop iterator variable than the range for loop can provide; for this, you may choose the C-style for loop. However, the additional complexity of C-style for loop control structures may result in fewer opportunities for the RPerl compiler to perform automatic optimizations when compared with range for loops.

BEST PRACTICES

  • When possible, always use a range for loop instead of a C-style for loop.

In the range for loop header, the range operator .. is the only operation. In the C-style for loop header, there are three operations which all pertain to the loop iterator variable, separated by semicolon ; characters: initialization, continuation condition, and incrementation. The initialization operation is an integer variable declaration which is only executed once, before the very first loop iteration. The continuation condition operation is an inequality operator which is evaluated for boolean truth value before each loop iteration, and the loop will continue iterating as long as the condition evaluates to true. The incrementation operation is generally an arithmetic incrementation operator which executes once at the end of every loop iteration.

All RPerl source code files which utilize a C-style for loop must also include a specific ## no critic qw(ProhibitCStyleForLoops) directive to avoid triggering an RPerl syntax error.

    ## no critic qw(ProhibitCStyleForLoops)
    for ( my integer $i = 1; $i <= 5; $i++ ) {
        print 'have $i = ', $i, "\n";
    }
    have $i = 1
    have $i = 2
    have $i = 3
    have $i = 4
    have $i = 5

In the source code example above, the initialization operation is my integer $i = 1, which achieves the same outcome as the for my integer $i and 1 .. components in the previous section's range for loop header. The continuation condition is $i <= 5, which is equivalent to the range for loop header's .. 5 component. Finally, the incrementation operation is $i++, which is equivalent to the implied increment-by-one of the range operator .. itself.

C-style for loop control structures always take the following form:

    ## no critic qw(ProhibitCStyleForLoops)
    for ( my integer $i = $lower; $i <= $upper; $i++ ) {
        # PERFORM OPERATIONS INSIDE LOOP
    }

In the source code example above, $i is the loop iterator variable, which you may name something other than $i, as with the range for loop. The $lower and $upper variables may be any expression, literal, or variable which produces a numeric value. The less-or-equal <= operator may be any comparison operator; if less-or-equal is used as shown in this example, then the number of times this loop will iterate is ($upper - $lower) + 1, again matching the behavior of the simpler range for loop control structure. The $i++ auto-increment operator may be any operation, although the obvious goal should be to increment or modify the loop iterator variable in some way. As with the range for loop example in the previous section, each time this loop iterates it will cause the value of $i to increase by one, starting at $lower and ending at $upper.
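To illustrate the extra control a C-style for loop offers, the sketch below uses two loops which a range for loop cannot express directly: one steps by two, and one counts downward. It is written in normal Perl with the RPerl type annotations omitted; the array variable names are hypothetical, and the join operator is used here only to display the results on one line.

```perl
use strict;
use warnings;

## no critic qw(ProhibitCStyleForLoops)

# step by 2 instead of 1, collecting the odd values from 1 through 9
my $odd_values = [];
for ( my $i = 1; $i <= 9; $i += 2 ) {
    push @{$odd_values}, $i;
}
print 'have $odd_values = ', join( ', ', @{$odd_values} ), "\n";

# count downward from 5 to 1, using >= and the auto-decrement operator
my $countdown = [];
for ( my $i = 5; $i >= 1; $i-- ) {
    push @{$countdown}, $i;
}
print 'have $countdown  = ', join( ', ', @{$countdown} ), "\n";
```

The first loop should collect 1, 3, 5, 7, 9 and the second should collect 5, 4, 3, 2, 1.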

As with range for loops, all C-style for loops have lexically scoped iterator variables. If you attempt to utilize a loop iterator variable outside of a loop, you will likely experience a C++ compile-time error.

Let's return again to our example from the integer_arrayref_to_string() subroutine, now converted to a C-style for loop:

    ## no critic qw(ProhibitCStyleForLoops)
    # loop through all valid values of $i for use as index to input array
    for ( my integer $i = 0; $i < $input_av_length; $i++ ) {
        # retrieve input array's element at index $i
        $input_av_element = $input_avref->[$i];

        # PERFORM OPERATIONS USING $input_av_element
    }

In this modified source code example, the loop header's initialization operation is my integer $i = 0. The continuation condition has been simplified and optimized by removing the subtract-one - 1 operation and utilizing a less-than < operator instead of a less-or-equal <= operator. (RPerl is usually capable of automatically performing this specific optimization on the range operator's second operand in range for loops, where applicable.) The incrementation operation is a simple $i++ auto-increment operator which increases the value of the iterator variable by one after each loop iteration. The loop body remains unchanged, and we can access the elements of array $input_avref in the exact same manner as in the original example. This C-style for loop will access each element of the array in order, and is functionally equivalent to the original range for loop.

Section 3.12.3: The foreach Loop

Sometimes you do not need a loop iterator variable at all, and instead you just need to access each element of an array in order; for this, you may choose the foreach loop control structure. By utilizing a foreach loop, you can rest assured that no code inside your loop body will cause problems by modifying your iterator variable, because there is no iterator to begin with!

BEST PRACTICES

  • When possible, always use a foreach loop instead of a for or while loop.

In the range for loop header there is one operation; in the C-style for loop header there are three operations; and in the foreach loop header there is one operation, the element variable declaration, followed by a list of elements.

Because the foreach loop has no iterator variable, we can skip directly to initializing the variable which will store each loop iteration's respective element, and this now occurs in the loop header instead of inside the loop body. The list of elements in a foreach loop header is where we get the individual elements accessible during each loop iteration.

    foreach my string $element ( 'smile', 'frown', 'blank stare', 'wince', 'pucker' ) {
        print 'have $element = ', $element, "\n";
    }
    have $element = smile
    have $element = frown
    have $element = blank stare
    have $element = wince
    have $element = pucker

In the source code example above, the declaration operation is my string $element, followed by the element list containing five string literals which describe common facial expressions.

foreach loop control structures always take the following form:

    foreach my string $element ( 'first', 'second', 'third' ) {
        # PERFORM OPERATIONS INSIDE LOOP
    }

In the source code example above, $element is the loop element variable of data type string, which you may name something other than $element and give any data type as appropriate. The list elements 'first', 'second', 'third' may be any expression(s), literal(s), or variable(s) which produce a valid list. It is also quite common to utilize the at-sign-curly-braces array dereference operator @{ } within the loop header's parentheses, in order to load list elements from within a pre-existing array variable (as seen in the next example below). The number of times this loop will execute is equal to the number of elements in the list, which is three in the current example.

As with the iterator variables in for loops, all the element variables in foreach loops are lexically scoped. If you attempt to utilize a loop element variable outside of a loop, you will likely experience a C++ compile-time error.

Let's return once more to our example from the integer_arrayref_to_string() subroutine, now finally converted to a foreach loop:

    # loop through all valid elements in input array
    foreach my integer $input_av_element ( @{$input_avref} ) {
        # PERFORM OPERATIONS USING $input_av_element
    }

In this re-modified source code example, the loop header's declaration operation is my integer $input_av_element. The list elements are taken directly from the $input_avref variable by use of the array dereference operator @{ }. The loop iterator variable is completely removed, as are the range operator, the continuation condition inequality operator (including subtract-one operation), and the incrementation operator in the loop header, as well as the array element retrieval operation inside the loop body. The resulting code is significantly simpler, cleaner, and less error-prone than before.
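For comparison, here is a sketch of the whole array stringification example rebuilt around a foreach loop. It is written in normal Perl with the RPerl type annotations omitted, and it relies on Perl's implicit number-to-string conversion in place of the integer_to_string() call. The flag name $is_first is an assumption, replacing $i_is_0 now that there is no $i.

```perl
use strict;
use warnings;

# hypothetical input data
my $input_avref = [8, 23, 17];

# begin output string with left-square-bracket
my $output_sv = '[';
my $is_first  = 1;

# loop through all elements in input array, no index variable needed
foreach my $input_av_element ( @{$input_avref} ) {
    # append comma & space to output string for all elements except the first
    if ($is_first) { $is_first = 0; }
    else           { $output_sv .= ', '; }

    # append element to output string
    $output_sv .= $input_av_element;
}

# end output string with right-square-bracket
$output_sv .= ']';

print $output_sv, "\n";
```

With the input shown, this should print [8, 23, 17], the same output as the range for loop version.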

Section 3.13: Punctuation Variables & Magic

In normal Perl, you may access and modify a number of "punctuation variables" AKA "special variables", which are created by the Perl language itself and may be utilized to achieve many different goals. Perl punctuation variables derive their collective name from their sometimes-cryptic combinations of punctuation characters, such as the dollar-sign $, at-sign @, percent-sign %, underscore _, and exclamation-point ! characters.

A semi-random (and relatively tiny) selection of Perl's numerous punctuation variables follows:

In the list above, the left-most item on each line is the punctuation form of each punctuation variable, followed by the plain-English equivalent(s). Out of all these (and the many other unlisted) punctuation variables, only @ARG is utilized in RPerl, and then only in the plain-English format, never the @_ punctuation format. Please see "CHAPTER 4: ORGANIZING BY SUBROUTINES" for more information on the @ARG special variable.

Practically all of the punctuation variables' functionality is based on Perl's high-magic software components. You may recall the mention of Perl's magic in the following sections:

The Low-Magic Perl Commandments (LMPCs) put an even finer point on it:

In keeping with the 23rd LMPC, RPerl does not support the use of punctuation variables.

Further, in keeping with the 11th LMPC, RPerl does not support the use of any high-magic Perl features, which is the basis of RPerl's ability to produce super-high-speed output. Future versions of RPerl may include support for medium-magic Perl features, and perhaps even high-magic features some day, but the default RPerl compile mode will always be based on the low-magic RPerl grammar, which generates the most high-performance output code possible.

Please review all of The Low-Magic Perl Commandments for more information about which Perl features are high-magic and which are low-magic.

Section 3.14: reverse Operator

Sometimes you will want to reverse the order in which a list or array's elements are stored, whereby the last element becomes first, and vice-versa. For this purpose, you may choose the reverse operator.

Like push & pop & friends, the reverse operator requires array arguments to be dereferenced, so don't forget to use the closed-fixity at-sign-curly-braces @{ } array dereference operator where appropriate.

Name     Symbol   Arity     Fixity  Precedence  Associativity  Supported
Reverse  reverse  Variadic  Prefix  01          Left           Coming Soon
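Because RPerl support for reverse is listed as coming soon, the sketch below shows the operator's behavior in normal Perl only, with the RPerl type annotations omitted. The variable names are hypothetical, and wrapping the result in square brackets to create a new array reference is an assumption about how you might store it; the eventual RPerl form may differ.

```perl
use strict;
use warnings;

# hypothetical input data
my $words = ['alpha', 'beta', 'gamma'];

# dereference the array for reverse, then wrap the resulting list
# in square brackets to store it as a new array reference
my $words_reversed = [ reverse @{$words} ];

foreach my $element ( @{$words_reversed} ) {
    print 'have $element = ', $element, "\n";
}
```

This should print gamma, then beta, then alpha. The original array in $words is left unchanged, because reverse returns a new list instead of modifying its argument.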

Section 3.15: sort Operator

Sometimes you will want to sort the order in which a list or array's elements are stored, whereby the element with the lowest value becomes first, and the element with the highest value is last. For this purpose, you may choose the sort operator.

Like reverse, the sort operator requires arrays to be dereferenced, so you must use the closed-fixity at-sign-curly-braces @{ } array dereference operator for array arguments.

Name  Symbol  Arity     Fixity  Precedence  Associativity  Supported
Sort  sort    Variadic  Prefix  01          Left           Coming Soon
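Because RPerl support for sort is listed as coming soon, the sketch below shows the operator's default behavior in normal Perl only, with the RPerl type annotations omitted. The variable names and input data are hypothetical. Without a custom comparison, normal Perl's sort compares elements as strings, which affects both mixed-case words and numbers.

```perl
use strict;
use warnings;

# hypothetical input data
my $words   = ['banana', 'Cherry', 'apple'];
my $numbers = [10, 9, 100, 1];

# dereference each array for sort, store results as new array references
my $words_sorted   = [ sort @{$words} ];
my $numbers_sorted = [ sort @{$numbers} ];

foreach my $element ( @{$words_sorted} ) {
    print 'have word   = ', $element, "\n";
}
foreach my $element ( @{$numbers_sorted} ) {
    print 'have number = ', $element, "\n";
}
```

The words should come out as Cherry, apple, banana, because uppercase letters precede lowercase letters in ASCII. The numbers should come out as 1, 10, 100, 9, because they are compared character-by-character as strings, not by numeric value.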

Section 3.16: Scalar & Array Contexts

In normal Perl, usually there is no explicit indication of the data type of an expression's return value; in other words, normal Perl code doesn't know ahead of time if any particular expression will return a string or an integer or an array, etc. RPerl solves this issue once-and-for-all by providing a fully-fledged data type system, as you have been learning about throughout this book.

Normal Perl attempts to approach this issue through a few different mechanisms, one of which is "context", meaning the requirements or expectations of the receiving operation. In other words, Perl tries to guess what kind of data you want, and may even change your data without telling you, based on the requirements of your own code. In this section we will consider scalar context, and how it relates to array context (AKA list context).

All three lines of code in the following example enforce scalar context on the right-hand-side of the equal-sign =, because the receiving operation on the left-hand-side is a scalar (integer) variable modification and thus requires a scalar value as input:

    my integer $i =                   3;  # scalar variable receiving scalar value, context    match, fine in Perl, fine  in RPerl
    my integer $j = scalar @{[2, 4, 6]};  # scalar variable receiving scalar value, context    match, fine in Perl, fine  in RPerl
    my integer $k =        @{[2, 4, 6]};  # scalar variable receiving array  value, context mismatch, fine in Perl, error in RPerl compiled modes
    print 'have $i = ', $i, "\n";
    print 'have $j = ', $j, "\n";
    print 'have $k = ', $k, "\n";

When executing the source code above in non-compiled mode, the following output is generated:

    have $i = 3
    have $j = 3
    have $k = 3

However, when we try to compile the source code, we experience an error:

    ERROR ECOGEASCP879, CODE GENERATOR, ABSTRACT SYNTAX TO C++: Array dereference of array reference must provide data type for array reference in CPPOPS_CPPTYPES mode, but no data type provided, dying

In non-compiled mode, the normal Perl interpreter will detect the context mismatch and force an implicit context switch from the array @{[2, 4, 6]} to the scalar $k. This same array-to-scalar context switch is achieved explicitly via the scalar operator. Both implicit and explicit context switches have the same effect: an array value put into scalar context will simply return the number of array elements.

The ability for Perl to guess and imply context is derived from high-magic features. Once again, the Low-Magic Perl Commandments are clear on the issue:

In keeping with the 57th LMPC, RPerl does not support the use of implied or mismatched context.

When you need to force an array or list into scalar context, which is the same as asking for an element count, then simply utilize the scalar operator as demonstrated above. Do not try to force a scalar into an array context; it makes no sense in RPerl and will only produce errors in your code.

Section 3.17: STDIN & Arrays

When we combine a while loop control structure with the <STDIN> input data stream, then the while loop is able to detect a special signal from the operating system, known as "End-Of-Transmission" (EOT) or "End-Of-File" (EOF), which tells the loop there is no more user input available. On Unix-like operating systems such as Linux, the EOF condition is triggered when a user presses the <CTRL-D> key combination on their keyboard.

Data collected from <STDIN> is always of the string data type, and each collected line normally contains an extra newline "\n" character at the end, due to the user pressing the <ENTER> key after each line. (If the user presses only <CTRL-D> and not <ENTER> after the last line of input, then the extra newline character will be missing on the last line collected.) We can use the safe chomp operator to remove the trailing newline character if present, and push each collected line of input onto an array:

    my string_arrayref $input_strings = [];
    print 'Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:', "\n";
    while (my string $input_string = <STDIN>) {
        chomp $input_string;
        push @{$input_strings}, $input_string;
    }
    print "\n";
    print 'have $input_strings = ', string_arrayref_to_string($input_strings), "\n";

When we run the source code example above, we will be prompted for input until we signal EOF:

    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:
    alpha
    beta
    gamma
    delta
    epsilon

    have $input_strings = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']

If we press <CTRL-D> immediately, then no input is collected and we have an empty array, zero elements:

    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:

    have $input_strings = []

If we press <ENTER> once before pressing <CTRL-D>, then a blank line of input is collected and we have an array with one element, an empty string:

    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:


    have $input_strings = ['']

If you want to accept only a certain number of input lines, instead of allowing the user to end input with <CTRL-D>, then one option would be to replace the while loop with a for loop instead. This is left as an exercise for the reader.

Section 3.18: Exercises

1. Reverse Collected Array Of Strings [ 30 mins ]

In the same LearningRPerl directory from the chapter 1 & 2 exercises, create a new sub-directory named Chapter3. In the new sub-directory, create an RPerl program file named exercise_1-stdin_strings_reverse.pl.

Use a while loop to collect data from <STDIN> until the user signals EOF, storing the collected strings in an array variable named $input_strings. Use the reverse operator and store the new data in a different array variable named $input_strings_reversed. Finally, use a foreach loop to print all the elements in reverse order, one-at-a-time. (Do not use the string_arrayref_to_string() subroutine.)

Your program should produce exactly the following output, when provided with the first seven letters of the alphabet as input:

    $ rperl -t LearningRPerl/Chapter3/exercise_1-stdin_strings_reverse.pl 
    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:
    a
    b
    c
    d
    e
    f
    g

    Strings in reverse order:
    g
    f
    e
    d
    c
    b
    a

2. Collect Array Indices [ 45 mins ]

Use the qw() operator to store the following list of words in an array variable named $flintstones_and_rubbles, in exactly this order:

Next, use a while loop to collect data from <STDIN> until the user signals EOF, converting the collected strings to integers and storing them in an array variable named $input_indices.

Finally, use a foreach loop to print all the elements of $flintstones_and_rubbles which correspond to the collected user input array indices, with the caveat that we do not expect the user to know array indices begin at zero, so the element fred is displayed by user input 1 instead of 0.

Your program should produce exactly the following output, when provided with the first four odd integers as input:

    $ rperl -t LearningRPerl/Chapter3/exercise_2-stdin_array_indices.pl 
    Please input zero or more integers with values ranging from 1 to 7, separated by <ENTER>, ended by <CTRL-D>:
    1
    3
    5
    7

    Flintstones & Rubbles:
    fred
    barney
    wilma
    bamm-bamm

3. Sort & Display Collected Array Of Strings [ 45 mins ]

Create a constant boolean value named SINGLE_LINE_OUTPUT, and set it to 0.

Next, use a while loop to collect data from <STDIN> until the user signals EOF, storing the collected strings in an array variable named $input_strings. Use the sort operator and store the new data in a different array variable named $input_strings_sorted.

Finally, use a foreach loop containing a nested if statement and chomp operator, in order to print all the elements in sorted order. If the value of SINGLE_LINE_OUTPUT is true, separate the elements by one space and display all on one line of output; otherwise, display each element on a new line of output.

Your program should produce exactly the output below, when SINGLE_LINE_OUTPUT is set to 0 and the following strings are provided as input:

    $ rperl -t LearningRPerl/Chapter3/exercise_3-stdin_strings_sort.pl 
    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:
    gygax
    grue
    gorn
    galactus
    galadriel

    Strings in ASCIIbetical order:
    galactus
    galadriel
    gorn
    grue
    gygax

Now, set the value of SINGLE_LINE_OUTPUT to 1 and you should see the following:

    $ rperl -t LearningRPerl/Chapter3/exercise_3-stdin_strings_sort.pl 
    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:
    gygax
    grue
    gorn
    galactus
    galadriel

    Strings in ASCIIbetical order:
    galactus galadriel gorn grue gygax


CHAPTER 4: ORGANIZING BY SUBROUTINES

As we have seen in the preceding chapters, Perl is an operator-rich language, meaning there are a relatively large number of built-in operators available to Perl programmers. However, sometimes you will want to create your own user-defined operation, which may be utilized like a built-in operator and which provides some custom functionality that's not already available. In these cases, you may create a new "subroutine".

As you may have expected, our first subroutine example will be our old favorite, "Hello, World!", implemented in RPerl:

    our void $hello_world = sub {
        print 'Hello, World!', "\n";
    };

In the source code example above, we have defined a very simple subroutine named hello_world(), which may be called as follows:

    hello_world();

When called, this subroutine simply prints the output:

    Hello, World!

Now let's dissect the above source code example into its individual components:

First, we see the keyword our, which is used in normal Perl to denote a global variable, and is used in RPerl for only a few specific purposes, most commonly to begin the definition of a subroutine. (We have already utilized the our keyword to declare the $VERSION variable in the header section of an RPerl program.) The definitions of all RPerl subroutines begin with the our keyword.

Second, we see the void data type, which provides the "return type" of the subroutine. All RPerl subroutines must specify exactly one return type, which is the data type of the value returned when the subroutine is finished running. A return type of void means this subroutine does not actually return any data whatsoever, which makes sense because all we have is a single print statement inside the subroutine, and generally the print statement is not considered to be an operator which returns a (useful) value. Normal Perl does not provide a type system, and thus cannot provide the ability to specify subroutine return types; this is only possible in RPerl.

Third, we see the variable name $hello_world, which provides the name of the subroutine. The definition of an RPerl subroutine is the only place where you will include a dollar-sign $ as part of the subroutine name; everywhere else in your source code you will refer to the subroutine without the dollar-sign $ prefix and with an added parentheses ( ) suffix, so our example becomes hello_world() throughout the rest of your code. In normal Perl, a subroutine name is very rarely defined with a dollar-sign $ prefix; this is a bit of tricky magic (oh no!) used internally by RPerl in order to enable subroutine return types. Because a special subroutine reference value is stored in the global scalar variable named $hello_world, you may not create a normal variable with the same name $hello_world in the main operations section of your RPerl program. However, you are free to create scalar variables named $hello_world within this or any other subroutine.

After the RPerl subroutine name, we will always see an equal-sign = followed by the sub keyword and a left-curly-brace { character. As usual, the equal-sign = is the assignment operator, which assigns the special subroutine reference value to the global scalar variable $hello_world. Both RPerl and normal Perl use the sub keyword to define a subroutine, as well as the left-curly-brace { to denote the beginning of the subroutine's body.

Following the left-curly-brace { character is the subroutine's body, which is a code block consisting of an arbitrary number of operations, and is generally the same in both RPerl and normal Perl. In the example above, the body consists of exactly one operation, the print 'Hello, World!', "\n"; statement.

Finally, we see a right-curly-brace } followed by a semicolon ; character, used to denote the end of the subroutine's body. In normal Perl, there is no trailing semicolon ; used after the right-curly-brace } character.
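
To make the anatomy above a little more concrete, here is a minimal sketch in normal Perl, assuming RPerl is not loaded at all. The void return type is left out because normal Perl has no type keywords, and the subroutine is invoked through its code reference with the arrow operator, because the hello_world() call form for this style of definition relies on RPerl's internals. The helper hello_world_text() is hypothetical, added only so the message can be inspected.

```perl
use strict;
use warnings;

# hypothetical helper, not part of the original example: builds the message
sub hello_world_text { return 'Hello, World!' . "\n"; }

# the anonymous subroutine is an ordinary value, assigned by the = operator
# to the package (global) scalar variable $hello_world
our $hello_world = sub {
    print hello_world_text();
    return;
};  # the trailing semicolon ends the assignment statement

# normal Perl calls a code reference using the arrow operator
$hello_world->();
```

Seen this way, the trailing semicolon ; is not special subroutine syntax; it simply terminates an assignment statement like any other.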

Section 4.1: Subroutine Definitions

In software development, we have the two related concepts of "subroutine declaration" and "subroutine definition". To declare a subroutine, a software developer must provide (at least) the subroutine's name and return type. To define a subroutine, a developer must provide at least the subroutine's name and body. In RPerl, these two steps of declaration and definition are always combined into one single step, where we provide the subroutine's name and return type and body all together, as seen in the "Hello, World!" example in the previous section.

Normal Perl does not provide the ability to specify a subroutine's return type, so the closest thing to the separation of subroutine declaration and definition in normal Perl is the usage of high-magic features such as the manipulation of subroutine references, etc. In the C and C++ programming languages, subroutine declarations and definitions are traditionally kept separate, often in completely different source code files.

Because RPerl subroutine declarations and definitions are always combined, we will simply use the term "subroutine definition" in this textbook.

RPerl subroutines with very short bodies may be written on a single line, so we can re-write the "Hello, World!" subroutine as a one-liner by simply deleting the two invisible newline characters:

    our void $hello_world = sub { print 'Hello, World!', "\n"; };

In normal Perl, the same subroutine would be written like this:

    sub hello_world { print 'Hello, World!', "\n"; }

Let's compare them more closely:

         sub  hello_world       { print 'Hello, World!', "\n"; }   #  Perl
    our void $hello_world = sub { print 'Hello, World!', "\n"; };  # RPerl

In the comparison above, we can quickly see the different placements of the sub keyword and RPerl's usage of the void return type. We can also easily point out the four additional bits of RPerl syntax not present in the normal Perl line: our, $, =, and ;.

In both RPerl and normal Perl, the capitalization of variables (and thus RPerl subroutine names) is significant, which is to say that subroutine names are case-sensitive. The subroutine name hello_world() is different from both Hello_World() and HELLO_world(), all of which are valid Perl (and RPerl) subroutine names. As detailed in "Section 2.5: Constant Data", the all-uppercase subroutine name HELLO_WORLD() is reserved for constant use only, as is any other all-uppercase subroutine or variable name in RPerl.

    our void $hello_world = sub { print 'Hello, World!', "\n"; };
    our void $Hello_World = sub { print 'Hello, World, again!', "\n"; };
    our void $HELLO_world = sub { print 'Hello, World, yet again!', "\n"; };

We can call these three similarly-named-yet-distinct subroutines as follows:

    hello_world();
    Hello_World();
    HELLO_world();

Doing so will print three different lines of output:

    Hello, World!
    Hello, World, again!
    Hello, World, yet again!

If we try to define a subroutine with an all-uppercase name, RPerl will give an error due to all-uppercase symbols being reserved for constants:

    our void $HELLO_WORLD = sub { print 'Hello, World!', "\n"; };

Before we can even try to call this HELLO_WORLD() subroutine, RPerl tells us we have made a mistake because an all-uppercase symbol is not a valid variable symbol (and thus not a valid RPerl subroutine name):

    ERROR ECOPARP00, RPERL PARSER, RPERL SYNTAX ERROR
    Failed RPerl grammar syntax check with the following information:
    
        File Name:         ...
        Line Number:       ...
        Unexpected Token:  $HELLO_WORLD
        Expected Token(s): VARIABLE_SYMBOL

As long as you are careful to use proper capitalization, you are free to utilize all-uppercase constant names such as PIE() alongside not-all-uppercase subroutine names:

         sub  sweet_tooth       { print 'Yum!  I love ', PIE(), ' pie.', "\n"; }   #   usable in Perl, error  in RPerl
    our void $SWEET_TOOTH = sub { print 'Yum!  I love ', PIE(), ' pie.', "\n"; };  # unusable in Perl, error  in RPerl
    our void $Sweet_Tooth = sub { print 'Yum!  I love ', PIE(), ' pie.', "\n"; };  # unusable in Perl, usable in RPerl
    our void $sweet_tooth = sub { print 'Yum!  I love ', PIE(), ' pie.', "\n"; };  # unusable in Perl, best   in RPerl

In the sweet_tooth() example above, the first line is usable only in normal Perl, the second line is not usable at all, and both the third and fourth lines are usable only in RPerl. Please see "Section 2.4.1: How To Select Expressive Variable Names" for best practices which apply equally to subroutine names as well as variable names. The fourth line fulfills best practices by naming the subroutine using all-lowercase letters.

Section 4.2: Subroutine Calls

As seen in the previous two sections, in order to run a subroutine we must call (AKA "invoke") the subroutine by name. This is achieved by simply typing the subroutine's name, without the dollar-sign $ prefix, and adding empty parentheses ( ) as a suffix. As with all Perl statements, we end with a semicolon ; character. In this way, we can call the best practices version of the "sweet tooth" example from the previous section:

    sweet_tooth();

Utilizing the original constant value from "Section 2.5: Constant Data", invoking the subroutine will print the following output:

    Yum!  I love pecan pie.

Remember, as mentioned in the previous section, all subroutine names are case-sensitive and must be spelled exactly the same for every subroutine call. In the following five subroutine calls, only the first line is a valid call because the rest are all misspelled:

    sweet_tooth();  # fine
    sweeT_tootH();  # error
    swEEt_tooth();  # error
    sweet_tOOth();  # error
    sw33t_t00th();  # error

Section 4.2.1: Parentheses Suffix & Ampersand Sigil Prefix

In RPerl, all subroutine calls must include parentheses ( ) as a suffix following the name of the subroutine, and best practices call for no space between the last letter of the name and the left-parenthesis ( character. Both RPerl and normal Perl allow whitespace between the subroutine name and the parentheses characters, although this is both unnecessary and potentially quite confusing, thus it is against best practices. Likewise, it is not an error to include optional whitespace between the left-parenthesis ( and right-parenthesis ) characters, but again, best practices admonish us to allow no extraneous whitespace between the parentheses.

BEST PRACTICES

  • Do not include whitespace before the left-parenthesis ( character in a subroutine call.
  • Do not include unnecessary whitespace between the left-parenthesis ( and right-parenthesis ) characters in a subroutine call.

The parentheses suffix is optional in normal Perl, and is required in RPerl:

    sweet_tooth;     # fine in Perl, error in RPerl
    sweet_tooth ();  # fine in Perl, fine  in RPerl
    sweet_tooth( );  # fine in Perl, fine  in RPerl
    sweet_tooth();   # fine in Perl, best  in RPerl

If we try to call an RPerl subroutine without the parentheses ( ) suffix, we receive a Perl use strict; error:

    ERROR ECOPAPL02, RPERL PARSER, PERL SYNTAX ERROR
    Failed normal Perl strictures-and-fatal-warnings syntax check with the following information:
    
        File Name:        ./lib/RPerl/Test/Subroutine/program_00_bad_01.pl
        Return Value:     2
        Error Message(s): Bareword "hello_world" not allowed while "strict subs" in use
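
The error message above comes from normal Perl's "strict subs" check rather than from RPerl's own grammar. As a rough illustration, assuming plain Perl with no RPerl installed, the sketch below reproduces the same kind of message by compiling a bareword which does not name any known subroutine. The name undefined_bareword is made up for this example, and the string form of eval is used only so the compile-time error can be captured and printed.

```perl
use strict;
use warnings;

# compile a tiny snippet at runtime, so the compile-time error lands in $@;
# the snippet inherits "use strict" from the surrounding code
my $ok    = eval q{ my $result = undefined_bareword; 1; };
my $error = $@;

print 'compiled okay? ', ( $ok ? 'yes' : 'no' ), "\n";
print 'error message: ', $error;
```

Adding the parentheses ( ) suffix tells Perl the bareword is meant to be a subroutine call, which is why the suffix is always required in RPerl.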

In normal Perl, subroutine calls may include an optional ampersand & sigil prefix, but this is unnecessary and thus unsupported by RPerl:

     sweet_tooth();  # fine in Perl, fine  in RPerl
    &sweet_tooth();  # fine in Perl, error in RPerl

If we try to call an RPerl subroutine using an ampersand & sigil, we receive a Perl::Critic policy violation:

    ERROR ECOPAPC02, RPERL PARSER, PERL CRITIC VIOLATION
    Failed Perl::Critic brutal review with the following information:
    
        File Name:    ...
        Line number:  ...
        Policy:       Perl::Critic::Policy::Subroutines::ProhibitAmpersandSigils
        Description:  Subroutine called with "&" sigil
        Explanation:  See Perl Best Practices page(s) 175

Section 4.3: Subroutine Return Values

In the previous sections, both the hello_world() and sweet_tooth() example subroutines have return types of void, which means no data is returned to the caller when these subroutines finish running. If we try to utilize the return value of a void subroutine, we will experience undefined behavior which will likely lead to other (possibly mysterious) errors in our code. For example, the variable $foo in the following line of source code will have an unknown and unpredictable value, because the return value of a void subroutine is undefined:

    my integer $foo = hello_world();  # undefined behavior, unpredictable value in $foo

Likewise, the outcome of the following addition + operator is unpredictable, and will probably cause other difficult-to-debug errors elsewhere in our source code:

    23 + hello_world()  # undefined behavior, unpredictable result
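
For comparison only, the sketch below shows what the normal Perl interpreter does with a bare return; operator: the caller receives undef in scalar context and an empty list in list context. This assumes plain Perl without RPerl types, and the subroutine name hello_void() is illustrative. Compiled RPerl code is not guaranteed to behave the same way, which is exactly why the text above treats the result as unpredictable.

```perl
use strict;
use warnings;

# illustrative subroutine which returns no data
sub hello_void { print 'Hello, World!', "\n"; return; }

my $foo = hello_void();    # scalar context: bare return gives undef
print 'is $foo defined? ', ( defined $foo ? 'yes' : 'no' ), "\n";

my @values = hello_void(); # list context: bare return gives an empty list
print 'number of values returned = ', scalar @values, "\n";
```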

So, when we want to utilize the return value of a subroutine, we must give the subroutine a non-void return type in the subroutine definition.

    our integer $jedi = sub { print q{"You love him, don't you?"}, "\n"; return 6; };

We can call the jedi() subroutine defined above, and actually utilize the return value:

    my integer $episode = jedi();
    print 'Return (value) of the Jedi, Episode ', $episode, "\n";

The above subroutine call and print statement produce the following output:

    "You love him, don't you?"
    Return (value) of the Jedi, Episode 6

Your RPerl subroutines may have any return type chosen from the full list of supported RPerl data types, such as number, string, integer_arrayref, etc.

Section 4.3.1: return Operator

The jedi() subroutine example in the previous section includes a return operator at the end of the subroutine's body, which serves to return the value 6 to the caller. In other words, the input operand of the return operator is the numeric literal 6, which is then passed back to whatever part of the software originally invoked the jedi() subroutine.

The return operator serves two purposes: returning a specific return value to the caller, and controlling termination of the subroutine at some point before the end of the subroutine's body. It is common to have multiple return operators inside one subroutine, organized using conditional statements or other logic:

    if ($foo < $bar) {
        return 23;
    } 
    else {
        return 42;
    }

A subroutine with void return type may also utilize the return operator, in order to control when and where the subroutine terminates:

    if ($foo < $bar) {
        print '$foo is too small, returning', "\n";
        return;
    }

    print '$foo is not too small, continuing', "\n";
    $foo = $bar * 17;
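
The two fragments above are shown without their enclosing subroutines. A minimal sketch of how they might look when wrapped up and called, written in plain Perl (no RPerl types) and using hypothetical names pick_value(), check_foo(), and @messages, could be:

```perl
use strict;
use warnings;
use English qw(-no_match_vars);  # provides @ARG as a readable name for @_

# hypothetical subroutine: two return operators selected by a conditional
sub pick_value {
    ( my $foo, my $bar ) = @ARG;
    if ( $foo < $bar ) {
        return 23;
    }
    else {
        return 42;
    }
}

# hypothetical log of messages, so the early return can be observed
my @messages = ();

# hypothetical subroutine: bare return used only to leave early
sub check_foo {
    ( my $foo, my $bar ) = @ARG;
    if ( $foo < $bar ) {
        push @messages, '$foo is too small, returning';
        return;
    }
    push @messages, '$foo is not too small, continuing';
    return;
}

print 'have pick_value(1, 2) = ', pick_value( 1, 2 ), "\n";
print 'have pick_value(2, 1) = ', pick_value( 2, 1 ), "\n";
check_foo( 1, 2 );
check_foo( 2, 1 );
print $_, "\n" for @messages;
```

Note that once a return operator runs, none of the statements after it in the subroutine body are executed for that call.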

In normal Perl, the return operator is optional, and Perl automatically returns the value of the last expression executed before the subroutine ends. However, leaving out the return operator in RPerl will once again lead to either an error, or undefined and unpredictable behavior in C++ compile modes.

            sub  jedi       { print q{"You love him, don't you?"}, "\n";        6; }   #   usable in Perl, error  in RPerl
    our integer $jedi = sub { print q{"You love him, don't you?"}, "\n";        6; };  # unusable in Perl, error  in RPerl
    our integer $jedi = sub { print q{"You love him, don't you?"}, "\n"; return 6; };  # unusable in Perl, usable in RPerl

If we exclude the return operator as in the second line above, then we will receive an error in RPerl compiled modes, because the semicolon ; of 6; is found before an actual operator is encountered:

    $ rperl jedi.pl

    ...

    ERROR ECOPARP00, RPERL PARSER, RPERL SYNTAX ERROR
    Failed RPerl grammar syntax check with the following information:
    
        File Name:         ...
        Line Number:       ...
        Unexpected Token:  ;
        ...

We can see undefined behavior when we fail to utilize a return operator in a subroutine which should return a value. In this case, an integer should be returned, but instead a string concatenation . operator is performed without an explicit return operator:

    our integer $unpredictable = sub { 'howdy' . 'doody'; };
    print 'have unpredictable() = ', unpredictable(), "\n";

When we run the above unpredictable() subroutine in RPerl test mode -t, we see the result of the string concatenation operator; when we run the same subroutine in RPerl compiled mode, we receive a seemingly-random number instead:

    $ rperl -t unpredictable.pl
    have unpredictable() = howdydoody

    $ rperl unpredictable.pl 
    have unpredictable() = 140725691414304

We can try to add a return operator to the unpredictable() subroutine:

    our integer $unpredictable = sub { return 'howdy' . 'doody'; };
    print 'have unpredictable() = ', unpredictable(), "\n";

Adding the return operator will lead to a C++ subcompile error, visible in RPerl debug mode -D, because the string concatenation . operator's string return type does not match the unpredictable() subroutine's integer return type:

    $ rperl -D unpredictable_return.pl

    [[[ SUBCOMPILE STDERR ]]]

    unpredictable_return.cpp: In function ‘integer unpredictable()’:
    unpredictable_return.cpp:18:52: error: cannot convert ‘std::__cxx11::basic_string<char>’ to ‘integer {aka long int}’ in return
         return (const string) "howdy" + (const string) "doody";
                                                        ^
    ERROR ECOCOSU04, COMPILER, SUBCOMPILE: C++ compiler returned error code, croaking...

For all subroutines with non-void return types, you must always use the return operator, and always provide the return operator with operands of the correct return type.

Section 4.3.2: Multiple Return Values

In normal Perl, the return operator may optionally be provided with more than one operand, thereby causing its subroutine to actually return multiple values to the originating caller. In this case, the calling software must know how to properly receive back more than one return value from the invoked subroutine, or else you may receive the incorrect data or experience undefined behavior.

The following subroutine foo_multi() is valid in normal Perl only:

    sub foo_multi { return 21, 22, 23; }
    (my $retval0, my $retval1, my $retval2) = foo_multi();
    print 'have $retval0 = ', $retval0, "\n";
    print 'have $retval1 = ', $retval1, "\n";
    print 'have $retval2 = ', $retval2, "\n";

Running the normal Perl foo_multi() code above will produce the following output:

    have $retval0 = 21
    have $retval1 = 22
    have $retval2 = 23

On the other hand, any RPerl subroutine defined with a non-void return type must provide each of its return operators with exactly one operand, thereby causing the subroutine to return exactly one value to the originating caller. Likewise, all non-void subroutine calls in RPerl must be designed to receive back exactly one data value of the subroutine's defined return type. (As with many design aspects of the RPerl compiler, this particular requirement is based on the behavior of the C++ target language.)

    our integer $foo_single = sub { return 23; };
    my integer $retval = foo_single();
    print 'have $retval = ', $retval, "\n";

Running the RPerl foo_single() code above will produce the following output:

    have $retval = 23

When you want to return multiple values from an RPerl subroutine, you may simply return an array reference containing more than one element. The caller will then receive back a single array reference, and can access each element as needed. In other words, you may wrap up multiple return values inside of a single array reference, and still be in compliance with the exactly-one-return-value requirement.

    our integer_arrayref $foo_multi = sub { return [21, 22, 23]; };
    my integer_arrayref $retvals = foo_multi();
    my integer $retval0 = $retvals->[0];
    my integer $retval1 = $retvals->[1];
    my integer $retval2 = $retvals->[2];
    print 'have $retval0 = ', $retval0, "\n";
    print 'have $retval1 = ', $retval1, "\n";
    print 'have $retval2 = ', $retval2, "\n";

Running the RPerl foo_multi() code above will produce the exact same variables and output as the normal Perl foo_multi() code:

    have $retval0 = 21
    have $retval1 = 22
    have $retval2 = 23
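
As a side note, normal Perl can unpack the returned array reference in a single list assignment, instead of one indexed access per element. The sketch below assumes plain Perl without RPerl types; whether RPerl itself accepts this shorter form is not confirmed here, so treat it as a normal Perl convenience only.

```perl
use strict;
use warnings;

# same idea as foo_multi() above, without the RPerl return type
sub foo_multi { return [ 21, 22, 23 ]; }

my $retvals = foo_multi();

# dereference the array reference and assign all three elements at once
my ( $retval0, $retval1, $retval2 ) = @{$retvals};

print 'have number of return values = ', scalar @{$retvals}, "\n";
print 'have $retval0 = ', $retval0, "\n";
print 'have $retval1 = ', $retval1, "\n";
print 'have $retval2 = ', $retval2, "\n";
```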

Currently, all RPerl subroutines utilizing an array reference return value must only return elements which all have the exact same data type as one another, because RPerl arrays are homogeneous with only one data type shared across all elements. As mentioned in "Section 3.2: 1-D Array Data Types & Constants", future versions of RPerl will probably provide a scalar_arrayref data structure which can hold elements of non-matching data types.

Section 4.4: Subroutine Arguments

When calling a subroutine, often you will want to provide one or more pieces of data as input to the subroutine; for this purpose, we use "subroutine arguments". Perl provides a special array variable named @ARG (the English module's readable name for @_), which contains all arguments which have been passed into each subroutine by its caller. For any RPerl subroutine which accepts arguments, the first statement of the subroutine must always be the argument definition operation, as seen in the following source code example:

    our void $foo_arg = sub {
        (my integer $arg1) = @ARG;
        print 'inside foo_arg(), have $arg1 = ', integer_to_string($arg1), "\n";
    };
    foo_arg(1_701);

Running the foo_arg() code example gives us this output:

    inside foo_arg(), have $arg1 = 1_701

It's easy for a subroutine to accept multiple arguments: simply separate each argument's variable declaration by a comma , character, and keep them all within the parentheses ( ) characters:

    our void $foo_args = sub {
        (my integer $arg1, my number $arg2, my string $arg3) = @ARG;
        print 'have $arg1 * $arg2 = ', number_to_string($arg1 * $arg2), "\n";
        print 'have $arg3 x $arg1 = ', ($arg3 x $arg1), "\n";
    };
    foo_args(5, 2.575, 'over and ');

Running the foo_args() code gives us:

    have $arg1 * $arg2 = 12.875
    have $arg3 x $arg1 = over and over and over and over and over and

Unsurprisingly, normal Perl provides magic which allows a subroutine to change its own argument names and data types at runtime, which is one of Perl's many dynamic behaviors. In the source code example below, we see the shift operator used to receive each argument one-at-a-time, where the name and (implied) data type of the second argument is dependent upon runtime conditional logic using the first argument:

    sub bar_args_dynamic {
        my $arg_type = shift @ARG;
        if ($arg_type eq 'integer') {
            my $bar_int = shift @ARG;
            print 'have $bar_int * 3 = ', $bar_int * 3, "\n";
        }
        else {
            my $bar_str = shift @ARG;
            print 'have $bar_str x 3 = ', $bar_str x 3, "\n";
        }
        return;
    }
    bar_args_dynamic('integer', 4);
    bar_args_dynamic('string', 'repeat');

Running the bar_args_dynamic() subroutine example produces the following output:

    have $bar_int * 3 = 12
    have $bar_str x 3 = repeatrepeatrepeat

In RPerl, the argument names and data types for each subroutine are set at compile time, which is one of RPerl's many static behaviors and which contributes to significant runtime performance optimizations. In other words, one reason why RPerl is faster than normal Perl is because RPerl already knows the argument names and data types for each subroutine when the rperl compiler is run.

By always accepting three arguments, we can upgrade from the bar_args_dynamic() subroutine to the bar_args_static() subroutine, which can then be compiled with RPerl:

    our void $bar_args_static = sub {
        (my string $arg_type, my integer $bar_int, my string $bar_str) = @ARG;
        if ($arg_type eq 'integer') {
            print 'have $bar_int * 3 = ', $bar_int * 3, "\n";
        }
        else {
            print 'have $bar_str x 3 = ', $bar_str x 3, "\n";
        }
        return;
    };
    bar_args_static('integer', 4, q{});
    bar_args_static('string',  0, 'repeat');

Running bar_args_static() gives us the exact same variables and output as bar_args_dynamic(), now able to be compiled and run at a much faster speed:

    have $bar_int * 3 = 12
    have $bar_str x 3 = repeatrepeatrepeat

Section 4.4.1: Variadic Subroutines

The number of arguments accepted by a subroutine is known as its "arity". A subroutine, operator, or other operation which is capable of accepting a varying number of input arguments is known as "variadic". (Please see "D.3: Syntax Arity, Fixity, Precedence, Associativity" for more information.)

In the previous section, we saw the bar_args_dynamic() subroutine use the shift operator (and associated Perl magic) in order to control its own accepted arguments. The same magic can be used in normal Perl to change, at runtime, the number of arguments accepted by a subroutine.

In normal Perl, a variadic subroutine can be created as follows:

    sub baz_variadic_dynamic {
        my $num_args = shift @ARG;
        my $arg2 = q{};
        my $arg3 = q{};
        my $arg4 = q{};
        if ($num_args >= 2) { $arg2 = shift @ARG; }
        if ($num_args >= 3) { $arg3 = shift @ARG; }
        if ($num_args >= 4) { $arg4 = shift @ARG; }
        print 'have $num_args = ', $num_args, "\n";
        print 'have $arg2 = ', $arg2, "\n";
        print 'have $arg3 = ', $arg3, "\n";
        print 'have $arg4 = ', $arg4, "\n\n";
        return;
    }
    baz_variadic_dynamic(1);
    baz_variadic_dynamic(2, 'howdy');
    baz_variadic_dynamic(3, 'howdy', 'doody');
    baz_variadic_dynamic(4, 'howdy', 'doody', 'time');

Running the normal Perl subroutine baz_variadic_dynamic(), we can see how the first argument $num_args is used to control how many additional arguments are accepted:

    have $num_args = 1
    have $arg2 = 
    have $arg3 = 
    have $arg4 = 

    have $num_args = 2
    have $arg2 = howdy
    have $arg3 = 
    have $arg4 = 

    have $num_args = 3
    have $arg2 = howdy
    have $arg3 = doody
    have $arg4 = 

    have $num_args = 4
    have $arg2 = howdy
    have $arg3 = doody
    have $arg4 = time

As we did with subroutine return values in "Section 4.3.2: Multiple Return Values", we can use an array reference to simulate a variadic subroutine while maintaining compatibility with RPerl:

    our void $baz_variadic_static = sub {
        (my integer $num_args, my string_arrayref $args) = @ARG;
        my string $arg2 = q{};
        my string $arg3 = q{};
        my string $arg4 = q{};
        if ($num_args >= 2) { $arg2 = $args->[0]; }
        if ($num_args >= 3) { $arg3 = $args->[1]; }
        if ($num_args >= 4) { $arg4 = $args->[2]; }
        print 'have $num_args = ', $num_args, "\n";
        print 'have $arg2 = ', $arg2, "\n";
        print 'have $arg3 = ', $arg3, "\n";
        print 'have $arg4 = ', $arg4, "\n\n";
        return;
    };
    baz_variadic_static(1, []);
    baz_variadic_static(2, ['howdy']);
    baz_variadic_static(3, ['howdy', 'doody']);
    baz_variadic_static(4, ['howdy', 'doody', 'time']);

In the baz_variadic_static() subroutine example above, we accept an array reference argument named $args and retrieve its elements as controlled by the $num_args input argument, achieving the same outcome as in baz_variadic_dynamic(). You will note the variable $arg2 receives its value from $args->[0] which is a seeming index difference of 2, caused by array indices starting at 0 instead of 1, combined with the fact that we have already accepted $num_args as a separate argument altogether.

Also, when we call baz_variadic_static(), all arguments after the first must be wrapped in square brackets [ ], denoting an array reference.

As you may expect, running the baz_variadic_static() subroutine gives us the same variables and output as baz_variadic_dynamic(), with the added performance optimizations of the RPerl compiler:

    have $num_args = 1
    have $arg2 = 
    have $arg3 = 
    have $arg4 = 

    have $num_args = 2
    have $arg2 = howdy
    have $arg3 = 
    have $arg4 = 

    have $num_args = 3
    have $arg2 = howdy
    have $arg3 = doody
    have $arg4 = 

    have $num_args = 4
    have $arg2 = howdy
    have $arg3 = doody
    have $arg4 = time
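
One possible variation on the array reference approach, sketched here in plain Perl without RPerl types, is to drop the separate count argument and compute the number of elements from the array reference itself, using the same scalar @{...} idiom which appears later in integer_arrayref_to_string(). The subroutine name baz_variadic_by_length() is hypothetical, and this is an illustration of the idea rather than a recommended RPerl pattern.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);  # provides @ARG as a readable name for @_

# hypothetical subroutine: accepts one array reference of any length,
# returns an array reference of descriptive lines
sub baz_variadic_by_length {
    ( my $args ) = @ARG;

    # number of elements in the array, no separate count argument needed
    my $num_args = scalar @{$args};

    my @lines = ( 'have $num_args = ' . $num_args );
    for my $i ( 0 .. ( $num_args - 1 ) ) {
        push @lines, 'have element ' . $i . ' = ' . $args->[$i];
    }
    return \@lines;
}

print join( "\n", @{ baz_variadic_by_length( [] ) } ), "\n\n";
print join( "\n", @{ baz_variadic_by_length( [ 'howdy', 'doody', 'time' ] ) } ), "\n";
```

Computing the length inside the subroutine removes the risk of the caller passing a count which does not match the actual number of elements.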

Section 4.5: Subroutine Variables, Variable Scope & Persistence

In normal Perl, variables inside a subroutine may behave in several complex ways, depending upon if and how the variable was declared, which can happen both inside and outside the subroutine itself. Perl provides a number of different ways to create a variable, including the by-now familiar my keyword, as well as the our and state keywords, and others.

Perl variables possess many (often magic) properties which affect their behavior, two of which are "scope" and "persistence". The scope of a variable is the subroutine, code block, or other region of code within which the variable is valid and accessible. Attempting to utilize a variable outside of its scope will lead to errors and undefined behavior.

The persistence of a variable is how long the variable will remain valid, before its memory is released and it is no longer accessible from anywhere within your Perl program. As with out-of-scope variables, attempting to access a variable after its persistence has expired will result in errors and undefined behavior.

Perl variables declared using the my keyword are classified as possessing "local scope", as well as "intermittent persistence" (AKA "non-persistent"). This means my variables declared inside a subroutine will only be valid within their own subroutine (local scope), and must be re-declared each time the subroutine is executed (intermittent). In Perl terminology, "local scope" is generally synonymous with "lexical scope" and "private scope". The my variable is the safest, simplest, and most optimizable classification of Perl variable.

    use strict;
    #print 'before defining quux(), have $local_intermittent = ', $local_intermittent, "\n";  # YES ERROR
    our void $quux = sub {
        my integer $local_intermittent = 23;
        print 'inside quux(), have $local_intermittent = ', $local_intermittent, "\n";        # NO  ERROR
        $local_intermittent++;
    };
    #print 'after  defining quux(), have $local_intermittent = ', $local_intermittent, "\n";  # YES ERROR

    quux();
    #print 'after   calling quux(), have $local_intermittent = ', $local_intermittent, "\n";  # YES ERROR
    quux();
    #print 'after   calling quux(), have $local_intermittent = ', $local_intermittent, "\n";  # YES ERROR

Running the example code above, which utilizes a my variable, gives us the following output:

    inside quux(), have $local_intermittent = 23
    inside quux(), have $local_intermittent = 23

Perl variables declared using our are of "global scope" and are "persistent", which means our variables may be accessed from anywhere within your Perl program and will never need to be re-declared. This is one of the least safe, simple, and optimizable classifications of variable.

    use strict;
    our integer $global_persistent = 23;
    print 'before defining quux(), have $global_persistent = ', $global_persistent, "\n";  # NO ERROR
    our void $quux = sub {
        print 'inside          quux(), have $global_persistent = ', $global_persistent, "\n";       # NO ERROR
        $global_persistent++;
    };
    print 'after  defining quux(), have $global_persistent = ', $global_persistent, "\n";  # NO ERROR

    quux();
    print 'after   calling quux(), have $global_persistent = ', $global_persistent, "\n";  # NO ERROR
    quux();
    print 'after   calling quux(), have $global_persistent = ', $global_persistent, "\n";  # NO ERROR

Running this our variable example code produces the output below:

    before defining quux(), have $global_persistent = 23
    after  defining quux(), have $global_persistent = 23
    inside          quux(), have $global_persistent = 23
    after   calling quux(), have $global_persistent = 24
    inside          quux(), have $global_persistent = 24
    after   calling quux(), have $global_persistent = 25

Perl variables declared using state are of local scope and are persistent; this means state variables may be accessed only from within the area of code where they were defined, but they do not ever need to be re-declared, and they will retain their values between each execution of their enclosing code block. A state variable declared inside a subroutine will save its value each time the subroutine is executed, providing yet another convenient use of magic in normal Perl. Like our variables, state variables score low for safety, simplicity, and optimizability.

    use strict;
    use feature 'state';
    #print 'before defining quux(), have $local_persistent = ', $local_persistent, "\n";  # YES ERROR
    our void $quux = sub {
        state integer $local_persistent = 23;
        print 'inside quux(), have $local_persistent = ', $local_persistent, "\n";        # NO  ERROR
        $local_persistent++;
    };
    #print 'after  defining quux(), have $local_persistent = ', $local_persistent, "\n";  # YES ERROR

    quux();
    #print 'after   calling quux(), have $local_persistent = ', $local_persistent, "\n";  # YES ERROR
    quux();
    #print 'after   calling quux(), have $local_persistent = ', $local_persistent, "\n";  # YES ERROR

Executing the state variable example code above will generate the following output:

    inside quux(), have $local_persistent = 23
    inside quux(), have $local_persistent = 24
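
The three examples above use RPerl data types, so they will not run in a Perl installation which lacks RPerl. For readers who want to experiment in normal Perl, the following side-by-side sketch drops the type names and uses hypothetical subroutine names next_my(), next_state(), and next_our(); each one returns its variable's current value and then increments it.

```perl
use strict;
use warnings;
use feature 'state';

our $global_persistent = 23;

# my: local scope, intermittent; re-created on every call
sub next_my { my $local_intermittent = 23; return $local_intermittent++; }

# state: local scope, persistent; initialized only on the first call
sub next_state { state $local_persistent = 23; return $local_persistent++; }

# our: global scope, persistent; shared with all other code in the package
sub next_our { return $global_persistent++; }

my @my_values    = ( next_my(),    next_my() );
my @state_values = ( next_state(), next_state() );
my @our_values   = ( next_our(),   next_our() );

print 'my    values: ', join( ', ', @my_values ),    "\n";
print 'state values: ', join( ', ', @state_values ), "\n";
print 'our   values: ', join( ', ', @our_values ),   "\n";
```

Only the my version returns 23 on both calls; the state and our versions return 23 and then 24, because their values persist between calls.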

In normal Perl, a variable which is utilized without first being declared will usually undergo the process of "autovivification", which means the variable is automatically created by Perl, according to Perl's own internal rules. Variables which have been autovivified may cause many difficult-to-debug issues, such as misspelled variables taking data without emitting any warnings, etc. Because of this, the use strict; pragma will cause an error to be triggered by any variables which would normally be autovivified. Like our and state variables, autovivified variables receive a low score for safety, simplicity, and speed.

    print 'before defining quux(), have $autovivified = ', $autovivified, "\n";  # NO ERROR, YES WARNING
    our void $quux = sub {
        $autovivified = 23;
        print 'inside          quux(), have $autovivified = ', $autovivified, "\n";      # NO ERROR
        $autovivified++;
    };
    print 'after  defining quux(), have $autovivified = ', $autovivified, "\n";  # NO ERROR, YES WARNING

    quux();
    print 'after   calling quux(), have $autovivified = ', $autovivified, "\n";  # NO ERROR
    quux();
    print 'after   calling quux(), have $autovivified = ', $autovivified, "\n";  # NO ERROR

Executing the autovivified code requires us to first disable use strict;, after which it will run with warnings:

    Use of uninitialized value $autovivified in print
    before defining quux(), have $autovivified = 
    Use of uninitialized value $autovivified in print
    after  defining quux(), have $autovivified = 
    inside          quux(), have $autovivified = 23
    after   calling quux(), have $autovivified = 24
    inside          quux(), have $autovivified = 23
    after   calling quux(), have $autovivified = 24
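
To see the protection offered by the use strict; pragma, the plain Perl sketch below (no RPerl required) tries to compile an assignment to a variable which was never declared. The variable name $misspelled_variable is made up, and the string form of eval is used only so the compile-time error can be captured and printed instead of halting the program.

```perl
use strict;
use warnings;

# the snippet inherits "use strict" from the surrounding code,
# so the undeclared variable is rejected at compile time
my $ok    = eval q{ $misspelled_variable = 23; 1; };
my $error = $@;

print 'compiled okay? ', ( $ok ? 'yes' : 'no' ), "\n";
print 'error message: ', $error;
```

With strictures enabled, the mistake is reported before any code runs, rather than silently creating a new variable.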

In RPerl, all variables are created using the my keyword, because it is the best choice: my variables are the least prone to cause errors, they are the easiest to convert into C++ output code, and they are the most optimizable. Also, because all RPerl code must utilize the use strict; pragma, there is no autovivification of RPerl variables. RPerl does not support our variables, state variables, autovivified variables, or any kind other than my variables.

Section 4.5.1: my Intermittent Variables

You may recall we analyzed part of the integer_arrayref_to_string() subroutine in "Section 3.11: Converting From Array To String". This particular subroutine makes use of multiple my variables, the significance of which is discussed after the source code example.

As promised, we will now review the full subroutine, for Perl operations & Perl data types mode:

    # stringify an integer_arrayref
    our string $integer_arrayref_to_string = sub {
        # require exactly one integer_arrayref as input, store in variable $input_avref
        ( my integer_arrayref $input_avref ) = @ARG;

        # declare local variables, av & sv mean "array value" & "scalar value" as used in Perl core
        my integer $input_av_length;
        my integer $input_av_element;
        my string $output_sv;
        my boolean $i_is_0 = 1;

        # compute length of (number of elements in) input array
        $input_av_length = scalar @{$input_avref};

        # begin output string with left-square-bracket, as required for all RPerl arrays
        $output_sv = '[';

        # loop through all valid values of $i for use as index to input array
        for my integer $i ( 0 .. ( $input_av_length - 1 ) ) {
            # retrieve input array's element at index $i
            $input_av_element = $input_avref->[$i];

            # append comma & space to output string for all elements except index 0
            if ($i_is_0) { $i_is_0 = 0; }
            else         { $output_sv .= ', '; }

            # stringify individual integer element, append to output string
            $output_sv .= integer_to_string($input_av_element);
        }

        # end output string with right-square-bracket, as required for all RPerl arrays
        $output_sv .= ']';

        # return output string, containing stringified input array
        return $output_sv;
    };

The above subroutine's name is, unsurprisingly, integer_arrayref_to_string(); it accepts exactly one input operand, an integer_arrayref accessed via the variable $input_avref, and it generates as output a string formed in the variable $output_sv.

In addition to the input operand, the integer_arrayref_to_string() subroutine defines four normal my variables: $input_av_length, $input_av_element, $output_sv, and $i_is_0. Also, the special loop iterator variable $i is declared using the my keyword.

The values of the four normal my variables are accessible only within this subroutine; in other words, they are locally-scoped AKA lexically-scoped variables. Any attempt to access or modify the values of the $input_av_length, $input_av_element, $output_sv, or $i_is_0 variables outside the integer_arrayref_to_string() subroutine will result in an error or undefined behavior.

The value of the special loop iterator $i has an even smaller scope, because it is only valid within the for loop body. The accepted terminology is to say that $i is a "lexical loop iterator" variable. Any attempt to utilize the $i variable outside the loop body code block will result in an error or undefined behavior.
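
The narrow scope of a lexical loop iterator can be checked with a short sketch in normal Perl 5, assuming the strict pragma is enabled. The variable names $sum, $visible_after_loop, and $scope_error are hypothetical, and string eval is used only to trap the error caused by naming $i after the loop has ended.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $sum = 0;
for my $i ( 0 .. 2 ) {
    # $i is visible here, inside the loop body
    $sum += $i;
}

# after the loop body ends, $i no longer exists;
# trap the resulting strict error using string eval
my $visible_after_loop = eval q{ my $copy = $i; 1; };
my $scope_error        = $@;

print 'sum = ', $sum, "\n";
print 'is $i visible after the loop?  ', ( $visible_after_loop ? 'yes' : 'no' ), "\n";
```

The variable $sum remains usable after the loop because it was declared outside the loop body, while the attempt to read $i fails.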

This style of programming, where all variables are created using the my keyword, allows our code to undergo additional optimizations for memory usage, serial runtime performance, and parallel runtime performance. Thus, we always use lexically-scoped variables created using Perl's my keyword, in order to make our RPerl output code run as fast as possible.

The string generated as output of the integer_arrayref_to_string() subroutine is itself valid RPerl source code, and may be copied, pasted, and re-parsed by the RPerl compiler:

    my integer_arrayref $foo;
    $foo = [8, 23, 17];
    print '$foo = ', integer_arrayref_to_string($foo), ';', "\n";

Running the code example above generates the following output string, which is itself an exact replica of the original line of valid RPerl source code:

    $foo = [8, 23, 17];
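
The round trip from array to string and back again can be sketched in normal Perl 5 as follows. This is only an illustration under stated assumptions: arrayref_to_string() is a hypothetical stand-in built on Perl's join operator, not the real integer_arrayref_to_string() subroutine, and Perl's own string eval plays the part of the parser instead of the RPerl compiler.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# hypothetical plain-Perl stand-in for integer_arrayref_to_string()
sub arrayref_to_string {
    my ($input_avref) = @_;
    return '[' . join( ', ', @{$input_avref} ) . ']';
}

my $foo      = [ 8, 23, 17 ];
my $foo_text = arrayref_to_string($foo);

# re-parse the generated text as Perl source code, yielding a new array reference
my $foo_copy = eval $foo_text;

print '$foo      = ', $foo_text, ';', "\n";
print '$foo_copy = ', arrayref_to_string($foo_copy), ';', "\n";
```

Both lines of output should show the same bracketed list, because the stringified array is itself valid source code for an equivalent array.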

Section 4.5.2: Persistent State, Pseudo-State Variables

RPerl does not support variables created using the magic state keyword. If you want to maintain a persistent state variable for one of your subroutines, then you may simply follow these steps: use the my keyword to declare a variable before calling the subroutine; pass the variable as a differently-named input argument to the subroutine; access and modify the variable while inside the subroutine; return the variable from the subroutine; receive the returned value back into the original variable; and repeat as many times as you like. Each time your subroutine runs, it will be able to utilize the persistent state contained within your variable, even though you used the my keyword instead of the state keyword to declare the variable.

We coin the term "pseudo-state variable" to mean a low-magic my variable used to simulate the behavior of a high-magic state variable.

    our integer $foo = sub {
        ( my integer $pseudo_state_in_foo ) = @ARG;
        print 'inside foo(), received $pseudo_state_in_foo = ', $pseudo_state_in_foo, "\n";
        $pseudo_state_in_foo++;
        print 'inside foo(), have modified $pseudo_state_in_foo = ', $pseudo_state_in_foo, "\n";
        return $pseudo_state_in_foo;
    };

    my integer $pseudo_state = 0;

    $pseudo_state = foo($pseudo_state);
    $pseudo_state = foo($pseudo_state);
    $pseudo_state = foo($pseudo_state);
    $pseudo_state = foo($pseudo_state);
    $pseudo_state = foo($pseudo_state);

Outside the subroutine, we declare our variable with the expressive name $pseudo_state as an indication of its purpose, although you may certainly choose to name your variable(s) differently.

Inside the subroutine, we declare our variable with a different name, in this case $pseudo_state_in_foo; by doing so, we remind ourselves these are truly two different variables with different scopes, and we also avoid any possible errors or unexpected behavior. The persistent variable is $pseudo_state, because its value persists after each call to foo() returns. The non-persistent variable is $pseudo_state_in_foo, because its value is accessible only inside each individual call to foo(). The trick to creating a pseudo-state variable is to make a connection between a persistent variable outside of a subroutine and a non-persistent variable inside the subroutine, as we have done in the example code above.

Running the pseudo-state source code example above produces the following output:

    inside foo(), received $pseudo_state_in_foo = 0
    inside foo(), have modified $pseudo_state_in_foo = 1
    inside foo(), received $pseudo_state_in_foo = 1
    inside foo(), have modified $pseudo_state_in_foo = 2
    inside foo(), received $pseudo_state_in_foo = 2
    inside foo(), have modified $pseudo_state_in_foo = 3
    inside foo(), received $pseudo_state_in_foo = 3
    inside foo(), have modified $pseudo_state_in_foo = 4
    inside foo(), received $pseudo_state_in_foo = 4
    inside foo(), have modified $pseudo_state_in_foo = 5
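
For comparison, the sketch below places a high-magic state variable side-by-side with a low-magic pseudo-state variable, written in normal Perl 5 because RPerl does not accept the state keyword. The subroutine names count_with_state() and count_with_pseudo_state() are hypothetical, and the state keyword assumes Perl v5.10 or newer.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'state';

# high-magic version: the counter lives hidden inside the subroutine
sub count_with_state {
    state $calls = 0;
    $calls++;
    return $calls;
}

# low-magic version: the caller owns the counter and passes it in
sub count_with_pseudo_state {
    my ($calls_in_sub) = @_;
    $calls_in_sub++;
    return $calls_in_sub;
}

my $pseudo_state = 0;
my $from_state   = 0;
for my $i ( 1 .. 3 ) {
    $from_state   = count_with_state();
    $pseudo_state = count_with_pseudo_state($pseudo_state);
    print 'call ', $i, ': state = ', $from_state, ', pseudo-state = ', $pseudo_state, "\n";
}
```

Both counters should advance in lockstep; the only difference is where the persistent value is stored, which in the pseudo-state version is plainly visible in the calling code.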

Section 4.6: use strict; use warnings; Pragmas & Magic

We have already been exposed to the use strict; and use warnings; source code components; these are both classified as Perl "pragma" statements (plural "pragmata"), which are special configuration controls used to change the way Perl itself functions. Both the strict and warnings pragmata appear in the header section of all RPerl programs:

    #!/usr/bin/env perl

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator

    # [[[ OPERATIONS ]]]
    print 'Howdy, Earth!', "\n";

Each of these pragma statements has multiple effects upon the Perl program for which it is enabled. The most pertinent outcome of the use strict; statement is to disable variable autovivification, which is why there is no strict pragma in the autovivified example of "Section 4.5: Subroutine Variables, Variable Scope & Persistence".

The use warnings; pragma enables numerous helpful warning messages, which are an invaluable aid to the development of reliable and bug-free Perl software. In the case of warnings generated at compile time, RPerl will automatically cause such warnings to become fatal errors, and will provide as much information as possible to help you fix any issues before you try again.
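
As a small illustration of the messages enabled by use warnings;, the following normal Perl 5 sketch captures a warning into an array so it can be counted and displayed. The $SIG{__WARN__} handler is a standard Perl mechanism chosen only for this demonstration; it is an assumption of the sketch and is not how RPerl itself turns warnings into fatal errors. The variable names are hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @captured_warnings;
{
    # temporarily route warnings into an array instead of the STDERR stream
    local $SIG{__WARN__} = sub { push @captured_warnings, $_[0]; };

    # $unset is declared but never given a value,
    # so using it in a string triggers an "uninitialized" warning
    my $unset;
    my $message = 'have $unset = ' . $unset;
}

print 'number of warnings captured: ', scalar @captured_warnings, "\n";
print 'first warning: ', $captured_warnings[0];
```

Without the warnings pragma, the same code would run silently and the empty value would pass unnoticed.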

Utilizing these two pragma statements is the first step toward disabling all of Perl's unnecessary and unwanted magic, which is cleverly hidden throughout every normal Perl program. This direct use of strict and warnings occurs during RPerl's parse phase 0, which is the "Check Perl Syntax" phase. It is impossible to disable use strict; and use warnings; in RPerl, as they are required to pass RPerl's parse phase 1 AKA "Criticize Perl Syntax", as well as parse phase 2 AKA "Parse RPerl Syntax".

All true Perl Monks harken to the (wise?) words of our own Perl Prophet, The Voice In The Wilderness:

    Avoid Ye Coders The Programming Of Slack
    Use strict & warnings, That Ye Catch No Flak

Section 4.7: Exercises

1. Subroutine To Calculate Total Of Numeric Array Elements [ 45 mins ]

In the same LearningRPerl directory from the previous chapters' exercises, create a new sub-directory named (you guessed it!) Chapter4. In the new sub-directory, create an RPerl program file named exercise_1-subroutine_total.pl.

Write an RPerl program which includes an array variable named $fred containing the data [1, 3, 5, 7, 9]. Add a subroutine named total() which accepts as input a number_arrayref and returns a number representing the sum total of all array elements. First, call total($fred) and display the results. Next, prompt the user to input as many numbers as they like, separated by <ENTER> and ended by <CTRL-D>, then store the results in an array and display the results of calling total() on the array. Remember to use the number_to_string() or to_string() subroutines.

Your program should generate exactly the following output, when provided with the corresponding input:

    $ rperl LearningRPerl/Chapter4/exercise_1-subroutine_total.pl
    The total of $fred is 25
    Please input zero or more numbers, separated by <ENTER>, ended by <CTRL-D>:
    8
    23
    17
    The total of those numbers is 48

2. Subroutine To Calculate Total Of Numbers 1 Through 1_000 [ 15 mins ]

In the Chapter4 sub-directory, create an RPerl program file named exercise_2-subroutine_total_1000.pl.

Write an RPerl program which includes the same subroutine total() from the previous exercise; you may copy-and-paste your own working source code, or re-write the subroutine from scratch, whichever you prefer. Create an array variable named $one_to_one_thousand, and make use of the range .. operator to fill the variable with consecutive integers starting with 1 and ending with 1_000. Call the subroutine and pass your variable as the input argument, then display the result.

Your program should generate exactly the following output:

    $ rperl -t LearningRPerl/Chapter4/exercise_2-subroutine_total_1000.pl 
    The total of 1 to 1000 is 500_500.

3. Subroutine To Find Above-Average Array Elements [ 45 mins ]

In the Chapter4 sub-directory, create an RPerl program file named exercise_3-subroutine_above_average.pl.

As in the previous exercise, write an RPerl program which includes the same subroutine total() once again. Add a subroutine named average(), which accepts a number_arrayref as input argument and returns a number representing the arithmetic mean of the array elements; call the subroutine total() from within average() as part of calculating the mean value. Next, add another subroutine named above_average(), which also accepts a number_arrayref argument, and which returns a different number_arrayref containing only the input array elements which are above the array's arithmetic mean; call the average() subroutine from within above_average() as part of your algorithm.

Now, create an array variable named $fred, and use the range .. operator to populate the array with the integers 1 through 10. Display all the values in $fred, then call above_average() with $fred as the input argument and display the results.

Lastly, create another array variable named $barney, and fill it with all the same values as $fred, with the addition of the value 100 added as the very first element (index 0) of $barney. As before, display the values in the $barney array, then pass it as the argument to above_average() and display the results.

Include some helpful hard-coded messages, to let your users know if the correct output is being generated.

Your program should generate exactly the following output:

    $ rperl -t LearningRPerl/Chapter4/exercise_3-subroutine_above_average.pl 
    $fred is [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    The above-average elements of $fred are [6, 7, 8, 9, 10]
    (Should be [6, 7, 8, 9, 10])

    $barney is [100, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    The above-average elements of $barney are [100]
    (Should be just [100])

4. Subroutine To Greet User & Store Program State [ 30 mins ]

In the Chapter4 sub-directory, create an RPerl program file named exercise_4-subroutine_greet.pl.

Write an RPerl program which includes a subroutine named greet(), which accepts as input two string variables called $name and $previous_name, and which returns a single string variable. The greet() subroutine should say "Hi" to the person whose name has been passed in as an argument; if $previous_name is an empty string then tell the person they are the first one, otherwise tell them who was the previous person to be greeted. Return the name of the person you have just greeted.

Outside the subroutine, create another different (although identically-named) string variable called $previous_name, and set its initial value to equal the empty string q{}. Call the greet() subroutine 4 times in a row with the following input names: 'Fred', 'Barney', 'Wilma', and 'Betty'. Store the return value of each subroutine call back into the $previous_name variable, and pass it as the second argument to each subsequent subroutine call.

Your program should generate exactly the following output, when provided with the specified names:

    $ rperl -t LearningRPerl/Chapter4/exercise_4-subroutine_greet.pl 
    Hi Fred!  You are the first one here!
    Hi Barney!  Fred is also here!
    Hi Wilma!  Barney is also here!
    Hi Betty!  Wilma is also here!

5. Subroutine To Greet User & Store More Program State [ 35 mins ]

In the Chapter4 sub-directory, create an RPerl program file named exercise_5-subroutine_greet_multiple.pl.

As in the previous exercise, write an RPerl program which includes a new subroutine, also named greet(). This version of greet() accepts as input a string argument called $name, as well as a string_arrayref argument called $previous_names. This greet() subroutine works similarly to the previous exercise, except it is capable of displaying multiple previous names, and it returns the $previous_names array updated to include the latest person's $name.

Your program should generate exactly the following output, when provided with the same names as the previous exercise:

    $ rperl -t LearningRPerl/Chapter4/exercise_5-subroutine_greet_multiple.pl 
    Hi Fred!  You are the first one here!
    Hi Barney!  I've seen: Fred
    Hi Wilma!  I've seen: Fred Barney
    Hi Betty!  I've seen: Fred Barney Wilma


CHAPTER 5: READING & WRITING FILES


CHAPTER 6: HASH VALUES & VARIABLES

A hash may not be a constant.

types() introspection subroutine


CHAPTER 7: THE REGULAR EXPRESSION SUB-LANGUAGE


CHAPTER 8: MATCHING BY REGULAR EXPRESSIONS


CHAPTER 9: PROCESSING BY REGULAR EXPRESSIONS


CHAPTER 10: ADDITIONAL CONTROL STRUCTURES


CHAPTER 11: CLASSES, PACKAGES, MODULES, LIBRARIES


CHAPTER 12: TESTING FILES & DIRECTORIES

Section 12.x.x: File Test Operators

    %token OP10_NAMED_UNARY_SCOLON   = /(-A;|-B;|-C;|-M;|-O;|-R;|-S;|-T;|-W;|-X;|-b;|-c;|-d;|-e;|-f;|-g;|-k;|-l;|-o;|-p;|-r;|-s;|-t;|-u;|-w;|-x;|-z;)/


CHAPTER 13: MANIPULATING FILES & DIRECTORIES


CHAPTER 14: SORTING TEXT VALUES


CHAPTER 15: ADDITIONAL SPECIAL OPERATORS


CHAPTER 16: MANAGING OPERATING SYSTEM PROCESSES


CHAPTER 17: ADDITIONAL ADVANCED TECHNIQUES

Section 17.1: SSE Data Structure & Operators

    # OP08_MATH_ADD_SUB = /(sse_add|sse_sub)/    # precedence 08 infix: SSE add 'sse_add', SSE subtract 'sse_sub'
    # OP07_MATH_MULT_DIV_MOD = /(sse_mul|sse_div)/  # precedence 07 infix: SSE multiply 'sse_mul', SSE divide 'sse_div'

Section 17.2: Misc & Unsorted Operators

    # REGEX: %token OP01_NAMED_SCOLON = /(m;|pos;|qr;|s;|study;|tr;|y;

    # FORMAT: %token OP01_NAMED_SCOLON = /(format;|formline;|write;

    # PACK / UNPACK: %token OP01_NAMED_SCOLON = /(pack;|unpack;|vec;

    # ADDITIONAL UNSORTED OPERATORS, HIDDEN FROM VIEW BELOW

Section 17.3: Type Introspection

types() introspection subroutine


APPENDIX A: EXERCISE ANSWERS

Chapter 1, Exercise 1

This exercise is commonly used as the first task for new programmers, or for programmers who are learning a new language.
The goal of this exercise is to become familiar with the boilerplate (often-repeated template) RPerl HEADER and CRITICS code sections, as well as the basic print operator.
The first line, starting with #! and called a "shebang", tells the operating system to run this program using Perl.
The 4 lines in the HEADER section tell Perl to run this program using RPerl, and the $VERSION number may be incremented for each update to this file.
The line in the CRITICS section, starting with ##, tells RPerl to allow hard-coded numeric values (not used in this program), as well as the print operator.
The last line, in the OPERATIONS section, calls the print operator on a comma-separated list consisting of 2 string literals, which will simply display the text "Hello, World!" followed by a newline character.
All other lines beginning with # are comments and, along with blank lines, may be safely ignored or removed.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 1, Exercise 1
    # Print "Hello, World!"; the classic first program for new programmers

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator

    # [[[ OPERATIONS ]]]
    print 'Hello, World!', "\n";

Example execution and output:

    $ rperl -t LearningRPerl/Chapter1/exercise_1-hello_world.pl
    Hello, World!


Chapter 1, Exercise 2

The goal of this exercise is to become familiar with the `rperl` command.

Example execution and output for 2a and 2b:

    $ rperl -?
        Usage:
                    rperl [ARGUMENTS] input_program_0.pl [input_program_1.pl input_program_2.pl ...]
                    rperl [ARGUMENTS] MyClassFoo.pm [MyClassBar.pm MyClassBat.pm ...]
                    rperl [ARGUMENTS] input_program_0.pl MyClassFoo.pm [input_program_1.pl ... MyClassBar.pm ...]

        Arguments:
            --help ...OR... -h ...OR... -?
                    Print this (relatively) brief help message for command-line usage.
                    For additional explanations, run the command `perldoc RPerl::Learning` and see Appendix B.

            --version ...OR... -v
            --vversion ...OR... -vv
                    Print version number and copyright information.
                    Repeat as 'vv' for more technical information, similar to `perl -V` configuration summary argument.
                    Lowercase 'v' not to be confused with uppercase 'V' in 'Verbose' argument.

            --infile=MyFile.pm ...OR... -i=MyFile.pm
                    Specify input file, may be repeated for multiple input files.
                    Argument prefix '--infile' may be entirely omitted.
                    Argument prefix MUST be omitted to specify wildcard for multiple input files.

            --outfile=MyCompiledModule ...OR... -o=MyCompiledModule
            --outfile=my_compiled_program ...OR... -o=my_compiled_program
                    Specify output file prefix, may be repeated for multiple output files.
                    RPerl *.pm input file with PERL ops will create MyCompiledModule.pmc output file.
                    RPerl *.pl input file with PERL ops will create my_compiled_program (or my_compiled_program.exe on Windows) output file.
                    RPerl *.pm input file with CPP  ops will create MyCompiledModule.pmc, MyCompiledModule.cpp, & MyCompiledModule.h output files.
                    RPerl *.pl input file with CPP  ops will create my_compiled_program (or my_compiled_program.exe on Windows) & my_compiled_program.cpp output files.
                    Argument may be entirely omitted, foo.* input file will default to foo.* output file(s).

            --CXX=/path/to/compiler
                    Specify path to C++ compiler for use in subcompile modes, equivalent to '--mode CXX=/path/to/compiler' or 'CXX' manual Makefile argument, 'g++' by default.

            --mode magic=LOW ...OR... -m magic=LOW
            --mode magic=MEDIUM ...OR... -m magic=MEDIUM
            --mode magic=HIGH ...OR... -m magic=HIGH
                    Specify magic mode, LOW by default.
                    If set to LOW, accept low-magic (static) Perl source code in the source code input file(s).
                    If set to MEDIUM, accept medium-magic (mostly static) Perl source code in the source code input file(s).
                    If set to HIGH, accept high-magic (dynamic) Perl source code in the source code input file(s).
                    Because only low-magic mode is supported at this time, this option does not currently have any effect.

            --mode code=PERL ...OR... -m code=PERL
            --mode code=CPP ...OR... -m code=CPP
                    Specify source code mode, CPP by default.
                    If set to PERL, generate Perl source code in the source code output file(s).
                    If set to CPP, generate C++ source code in the source code output file(s).
                    PERL operations mode forces PERL code mode; CPP operations mode forces CPP code mode.
                    Because code mode is dependent upon operations mode, this option does not currently have any effect.

            --mode ops=PERL ...OR... -m ops=PERL
            --mode ops=CPP ...OR... -m ops=CPP
                    Specify operations mode, CPP by default.
                    If set to PERL, generate Perl operations in the source code output file(s).
                    If set to CPP, generate C++ operations in the source code output file(s).
                    PERL ops mode forces PERL types mode & PARSE or GENERATE compile mode; PERLOPS_PERLTYPES is test mode, does not actually compile.

            --mode types=PERL ...OR... -m types=PERL
            --mode types=CPP ...OR... -m types=CPP
            --mode types=DUAL ...OR... -m types=DUAL
                    Specify data types mode, CPP by default.
                    If set to PERL, generate Perl data types in the source code output file(s).
                    If set to CPP, generate C++ data types in the source code output file(s).
                    If set to DUAL, generate both Perl and C++ data types in the source code output file(s).
                    DUAL mode allows generate-once-compile-many types, selected by '#define __FOO__TYPES' in lib/rperltypes_mode.h or `gcc -D__FOO__TYPES` manual subcompile argument.

    [[[ REMAINING ARGUMENTS OMITTED FOR BREVITY ]]]

Example execution and output for 2c and 2d:

    $ rperl -t -V LearningRPerl/Chapter1/exercise_1-hello_world.pl 
    Verbose Flag:       1
    Debug Flag:         0
    Test Flag:          1
    Input File:         LearningRPerl/Chapter1/exercise_1-hello_world.pl
    Output File(s):     LearningRPerl/Chapter1/exercise_1-hello_world  
    Modes:              ops => PERL, types => PERL, check => TRACE, compile => GENERATE, execute => ON, label => ON

    DEPENDENCIES:       Follow & find all deps...    0 found.
    PARSE PHASE 0:      Check     Perl syntax...        done.
    PARSE PHASE 1:      Criticize Perl syntax...        done.
    PARSE PHASE 2:      Parse    RPerl syntax...        done.
    GENERATE:           Generate RPerl syntax...        done.
    EXECUTE:            Run code...

    Hello, World!


    $ rperl -t -D LearningRPerl/Chapter1/exercise_1-hello_world.pl 
    in rperl, have $RPerl::DEBUG = 1
    in rperl, have $RPerl::VERBOSE = 0
    Hello, World!


    $ rperl -t -V -D LearningRPerl/Chapter1/exercise_1-hello_world.pl 
    Verbose Flag:       1
    Debug Flag:         1
    Test Flag:          1

    in rperl, have $RPerl::DEBUG = 1
    in rperl, have $RPerl::VERBOSE = 1
    Input File:         LearningRPerl/Chapter1/exercise_1-hello_world.pl
    Output File(s):     LearningRPerl/Chapter1/exercise_1-hello_world  
    Modes:              ops => PERL, types => PERL, check => TRACE, compile => GENERATE, execute => ON, label => ON

    DEPENDENCIES:       Follow & find all deps...    0 found.
    PARSE PHASE 0:      Check     Perl syntax...        done.
    PARSE PHASE 1:      Criticize Perl syntax...        done.
    PARSE PHASE 2:      Parse    RPerl syntax...        done.
    GENERATE:           Generate RPerl syntax...        done.
    EXECUTE:            Run code...

    Hello, World!

Of the above 3 commands executed for 2c and 2d, the first includes normal output plus additional verbose output; the second includes normal output plus additional debugging output (minimal in this simple case); and the third includes normal output plus both verbose and debugging output.


Chapter 1, Exercise 3

The goal of this exercise is to become familiar with basic variables and arithmetic.
The shebang line, HEADER section, and first line in the CRITICS section are the same boilerplate as the previous exercise.
The second line in the CRITICS section tells RPerl to allow the $ (dollar) character, among others, to be displayed using the print operator.
The first 4 lines in the OPERATIONS section each declare a new variable; $foo and $bar each hold an integer (non-floating-point) numeric value, while $baz and $zab each hold a number (floating-point) value.
The $foo and $bar variables receive their values from hard-coded numeric values being operated upon by the addition + and multiplication * operators, respectively; the $baz and $zab variables receive their values from the $foo and $bar variables being operated upon by the division / operator. The to_number() type conversion subroutine is necessary for the C++ compiler to use floating-point arithmetic instead of integer arithmetic on the division / operators; without it we would receive the truncated integer value of 29 in $baz, for example. We will receive the correct floating-point answers if we call to_number() on either the numerator (as shown), or the denominator, or both; in other words, at least one of the input operands for each operator must be passed through to_number().
The last 4 lines call the print operator to display the name of each variable; followed by each variable's respective value, converted from number to underscore-formatted string via the RPerl type conversion subroutine to_string(); followed by a newline character.

    #!/usr/bin/env perl
    
    # Learning RPerl, Chapter 1, Exercise 3
    # Foo Bar Arithmetic Example
    
    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;
    
    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils
    
    # [[[ OPERATIONS ]]]
    my integer $foo = 21 + 12;
    my integer $bar = 23 * 42;
    my number  $baz = to_number($bar) / $foo;
    my number  $zab = to_number($foo) / $bar;
    print 'have $foo = ', to_string($foo), "\n";
    print 'have $bar = ', to_string($bar), "\n";
    print 'have $baz = ', to_string($baz), "\n";
    print 'have $zab = ', to_string($zab), "\n";

Example execution and output:

    $ rperl -t LearningRPerl/Chapter1/exercise_3-foo_bar_arithmetic.pl 
    have $foo = 33
    have $bar = 966
    have $baz = 29.272_727_272_727_3
    have $zab = 0.034_161_490_683_229_8
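
The difference between floating-point division and truncating integer division can be previewed in normal Perl 5 with the sketch below. It relies on Perl's standard integer pragma to imitate C++ integer arithmetic, which is an assumption made purely for illustration; it is not the code which RPerl generates, and it does not call to_number(). The variable names $floating and $truncated are hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $foo = 21 + 12;    # 33
my $bar = 23 * 42;    # 966

# normal Perl division keeps the fractional part
my $floating = $bar / $foo;

# within this block only, the integer pragma makes division discard the fractional part
my $truncated;
{
    use integer;
    $truncated = $bar / $foo;
}

print 'floating-point division: ', $floating,  "\n";
print 'integer        division: ', $truncated, "\n";
```

The integer result should be 29, with everything after the decimal point silently discarded.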


Chapter 2, Exercise 1

The goal of this exercise is to become familiar with constant values.
The first line in the CRITICS section tells RPerl to allow hard-coded numeric values as well as the print operator, both of which are utilized in this program.
The second line in the CRITICS section tells RPerl to allow the use constant operation.
The line in the CONSTANTS section declares a number (floating-point) constant value named PI, automatically accessible via a subroutine named PI().
The inner type variable $TYPED_PI is only used for RPerl parsing purposes; for example, if your constant is named FOO then you should declare it using the inner type variable $TYPED_FOO and you should access it by calling the subroutine FOO(), but you should never directly utilize the variable $TYPED_FOO anywhere else in your code.
The first 2 lines in the OPERATIONS section each create a new number variable, with $radius set to the hard-coded value of 12.5 and $circumference set to the well-known basic geometry formula "circumference equals 2 pi times radius".
The last 3 lines call the print operator to display the values of PI, $radius, and $circumference, each followed by a newline character.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 2, Exercise 1
    # Find the circumference of a circle with hard-coded radius of 12.5 units

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitConstantPragma ProhibitMagicNumbers)  # USER DEFAULT 3: allow constants

    # [[[ CONSTANTS ]]]
    use constant PI => my number $TYPED_PI = 3.141_592_654;

    # [[[ OPERATIONS ]]]
    my number $radius = 12.5;
    my number $circumference = 2 * PI() * $radius;

    print 'Pi = ', to_string(PI()), "\n";
    print 'Radius = ', to_string($radius), "\n";
    print 'Circumference = 2 * Pi * Radius = 2 * ', to_string(PI()), ' * ', to_string($radius), ' = ', to_string($circumference), "\n";

Example execution and output:

    $ rperl -t LearningRPerl/Chapter2/exercise_1-circumference_of_specific_radius.pl 
    Pi = 3.141_592_654
    Radius = 12.5
    Circumference = 2 * Pi * Radius = 2 * 3.141_592_654 * 12.5 = 78.539_816_35


Chapter 2, Exercise 2

The goal of this exercise is to become familiar with accepting user keyboard input.
The third line in the CRITICS section tells RPerl to allow user input via the <STDIN> standard stream, a software input connection which is attached to the keyboard by default.
The second line in the OPERATIONS section creates a string variable $radius_string, and assigns to it the text value typed by the user on their keyboard.
The third line in the OPERATIONS section creates a number variable $radius, and assigns to it the numeric value returned by calling the RPerl type conversion subroutine string_to_number() on the string variable $radius_string.
This exercise is otherwise identical to the previous exercise.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 2, Exercise 2
    # Find the circumference of a circle with any radius entered by the user

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitConstantPragma ProhibitMagicNumbers)  # USER DEFAULT 3: allow constants
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ CONSTANTS ]]]
    use constant PI => my number $TYPED_PI = 3.141_592_654;

    # [[[ OPERATIONS ]]]
    print 'Please input radius: ';
    my string $radius_string = <STDIN>;
    my number $radius = string_to_number($radius_string);
    my number $circumference = 2 * PI() * $radius;

    print "\n";
    print 'Pi = ', to_string(PI()), "\n";
    print 'Radius = ', to_string($radius), "\n";
    print 'Circumference = 2 * Pi * Radius = 2 * ', to_string(PI()), ' * ', to_string($radius), ' = ', to_string($circumference), "\n";

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter2/exercise_2-circumference_of_any_radius.pl 
    Please input radius: 2

    Pi = 3.141_592_654
    Radius = 2
    Circumference = 2 * Pi * Radius = 2 * 3.141_592_654 * 2 = 12.566_370_616


Chapter 2, Exercise 3

The goal of this exercise is to become familiar with conditional statements and comparison operators.
In the OPERATIONS section, the line starting with if ($radius >= 0) denotes the beginning of a conditional statement: if the numeric value of the variable $radius is greater-than-or-equal-to 0, then the normal calculation for $circumference is used; if $radius is less-than 0 (physically impossible), then a warning message is printed and $circumference is set to 0.
This exercise is otherwise identical to the previous exercise.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 2, Exercise 3
    # Find the circumference of a circle with any positive radius entered by the user

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitConstantPragma ProhibitMagicNumbers)  # USER DEFAULT 3: allow constants
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ CONSTANTS ]]]
    use constant PI => my number $TYPED_PI = 3.141_592_654;

    # [[[ OPERATIONS ]]]
    print 'Please input radius: ';
    my string $radius_string = <STDIN>;
    my number $radius = string_to_number($radius_string);
    my number $circumference;

    if ($radius >= 0) {
        $circumference = 2 * PI() * $radius;
    }
    else {
        print 'Negative radius detected, defaulting to zero circumference!', "\n";
        $circumference = 0;
    }

    print "\n";
    print 'Pi = ', to_string(PI()), "\n";
    print 'Radius = ', to_string($radius), "\n";
    print 'Circumference = 2 * Pi * Radius = 2 * ', to_string(PI()), ' * ', to_string($radius), ' = ', to_string($circumference), "\n";

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter2/exercise_3-circumference_of_any_positive_radius.pl 
    Please input radius: -2
    Negative radius detected, defaulting to zero circumference!

    Pi = 3.141_592_654
    Radius = -2
    Circumference = 2 * Pi * Radius = 2 * 3.141_592_654 * -2 = 0


Chapter 2, Exercise 4

The goal of this exercise is to gain further exposure to the <STDIN> standard stream and variable multiplication.
In the OPERATIONS section, <STDIN> is accessed to collect user input for both the $multiplicator_string and $multiplicand_string variables.
These 2 string variables are converted from text values to numeric values by calling string_to_number(), then multiplied via the * multiplication operator, and the results displayed by calling print.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 2, Exercise 4
    # Find the product of any two numbers entered by the user

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ OPERATIONS ]]]
    print 'Please input multiplicator: ';
    my string $multiplicator_string = <STDIN>;
    my number $multiplicator = string_to_number($multiplicator_string);

    print 'Please input multiplicand: ';
    my string $multiplicand_string = <STDIN>;
    my number $multiplicand = string_to_number($multiplicand_string);

    my number $product = $multiplicator * $multiplicand;

    print "\n";
    print 'Product = Multiplicator * Multiplicand = ', to_string($multiplicator), ' * ', to_string($multiplicand), ' = ', to_string($product), "\n";

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter2/exercise_4-product_of_any_two_numbers.pl 
    Please input multiplicator: 2112
    Please input multiplicand: 23.42

    Product = Multiplicator * Multiplicand = 2_112 * 23.42 = 49_463.04


Chapter 2, Exercise 5

The goal of this exercise is to become familiar with the x string repeat operator.
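In the OPERATIONS section, <STDIN> is accessed twice to collect user input: first for a string variable $repeat_string to be repeated, and then for a string variable $repeat_integer_string, which is converted by calling string_to_integer() into the integer variable $repeat_integer to specify the number of repetitions.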
The last line contains 2 operators, print and the x string repeat operator.
Like the . (single dot) string concatenation operator, the x operator has a higher precedence than print and is thus executed first, generating a single string value composed of the original string repeated 0 or more times; the resulting string is then displayed by calling print. Because the user input retains its trailing newline character, each repetition is displayed on its own line.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 2, Exercise 5
    # Repeat any string any number of times, both values entered by the user

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ OPERATIONS ]]]
    print 'Please input string to be repeated: ';
    my string $repeat_string = <STDIN>;

    print 'Please input integer (whole number) times to repeat string: ';
    my string $repeat_integer_string = <STDIN>;
    my integer $repeat_integer = string_to_integer($repeat_integer_string);

    print "\n";
    print $repeat_string x $repeat_integer;

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter2/exercise_5-string_repeat.pl 
    Please input string to be repeated: howdy
    Please input integer (whole number) times to repeat string: 3

    howdy
    howdy
    howdy
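
The precedence behavior described in this exercise can be sketched in plain Perl 5, written without the RPerl type declarations so that it should run on any Perl interpreter. The helper name repeat_string() is hypothetical and is not part of the original solution.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# ASSUMPTION: plain Perl 5 sketch, not RPerl; repeat_string() is a hypothetical helper
sub repeat_string {
    my ($string, $count) = @_;

    # x is evaluated before the surrounding list operator receives its arguments
    return $string x $count;
}

print repeat_string("howdy\n", 3);    # displays "howdy" on three separate lines
print 'ab' x 2 . 'c', "\n";           # x binds tighter than . so this displays ababc
print 'unseen' x 0, "\n";             # a count of 0 yields the empty string
```

A repeat count of 0 is legal and simply produces an empty string, which matches the "0 or more times" wording above.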


Chapter 2, Exercise 6

The goal of this exercise is to become familiar with the while loop control structure.
In the OPERATIONS section, <STDIN> is accessed to collect user input into the string variable $n_string, then the data type conversion subroutine string_to_integer() is called to create the corresponding integer variable $n.
The if ($n < 0) conditional control structure calls die to exit with an error message if $n is less than 0.
Two additional integer variables are created: $sum is initialized to 0 and will eventually hold the final answer value; and $i is initialized to 1 for use as the iteration counter variable (AKA "iterator") inside the following while loop control structure.
The while loop iteratively executes as long as the integer value of $i is less-than-or-equal-to $n. Each iteration of the while loop causes the value of $sum to be increased by $i, after which $i itself is incremented by 1, and the loop is repeated.
The last line calls print and the to_string() conversion subroutine to display the final answer; the . (single dot) string concatenation operator is used only in the earlier call to die.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 2, Exercise 6
    # Calculate the sum of the first n integers; 1 + 2 + 3 + ... + n = ?

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ OPERATIONS ]]]
    print 'Please input a positive integer: ';
    my string $n_string = <STDIN>;
    my integer $n = string_to_integer($n_string);

    if ($n < 0) { die 'ERROR: ' . to_string($n) . ' is not positive, dying' . "\n"; }

    my integer $sum = 0;
    my integer $i = 1;

    while ($i <= $n) {
        $sum += $i;
        $i++;
    }

    print 'The sum of the first ', to_string($n), ' integers is ', to_string($sum), "\n";

Example executions, input, and output:

    $ rperl -t LearningRPerl/Chapter2/exercise_6-sum_of_first_n_integers.pl 
    Please input a positive integer: 100
    The sum of the first 100 integers is 5_050

    $ rperl -t LearningRPerl/Chapter2/exercise_6-sum_of_first_n_integers.pl 
    Please input a positive integer: -100
    ERROR: -100 is not positive, dying


Chapter 2, Exercise 7

The goal of this exercise is to utilize an optimization to remove the while loop from the previous exercise.
In the OPERATIONS section, user input is collected and error-checked, as before.
By utilizing an algorithm attributed to Gauss, we are able to remove the while loop, so we also remove the $i loop iterator. The $sum variable is kept the same as the previous exercise.
Two new integer variables are also created: $n_original is initialized to hold a copy of $n which will never change; and $n_odd is initialized to 0.
The code block beginning with if ($n % 2) is used to detect if the user provided a value for $n which is odd (not even); this is detected by utilizing the % modulo operator to test for a remainder when dividing $n by 2. If $n is odd, then a second copy of the original value of $n is stored in $n_odd, after which $n itself is decremented by 1. This if statement allows our algorithm to accept odd as well as even user input values for $n.
The computational kernel (most important part) is the one line of arithmetic: $sum = (($n + 1) * ($n / 2)) + $n_odd;
Recall the story of Gauss' algorithm in the exercise description; our arithmetic generalizes the idea of young Gauss by using $n instead of 100, and by adding $n_odd to enable support for odd values of $n. Using Gauss' own example of $n equal to 100, which is even (not odd) so $n_odd will equal 0, we can see the algorithm becomes: $sum = ((100 + 1) * (100 / 2)) + 0;
One more step of arithmetic simplification shows our algorithm to be the same as Gauss': $sum = 101 * 50;
The last line calls print using the value of $n_original, because the value of $n will have been decreased by 1 if $n is odd.
It is usually faster to run an algorithm without a loop than with a loop, because the loop-free version performs a fixed number of operations, while the loop performs one iteration for each value from 1 to $n. However, not all problems can be easily optimized by changing to a new algorithm, because the new algorithm may be prohibitively complex or there may only be 1 known algorithm. In this case, Gauss' algorithm should be faster than the while loop algorithm, especially for very large values of $n. It may be possible to further optimize this exercise by utilizing bitwise operators to replace the modulo and division operators.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 2, Exercise 7
    # Calculate the sum of the first n integers, without using a loop; 1 + 2 + 3 + ... + n = ?

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ OPERATIONS ]]]
    print 'Please input a positive integer: ';
    my string $n_string = <STDIN>;
    my integer $n = string_to_integer($n_string);

    if ($n < 0) { die 'ERROR: ' . to_string($n) . ' is not positive, dying' . "\n"; }

    my integer $sum = 0;
    my integer $n_original = $n;
    my integer $n_odd = 0;

    if ($n % 2) {
        $n_odd = $n;
        $n--;
    }

    $sum = (($n + 1) * ($n / 2)) + $n_odd;

    print 'The sum of the first ', to_string($n_original), ' integers is ', to_string($sum), "\n";

Example executions, input, and output:

    $ rperl -t LearningRPerl/Chapter2/exercise_7-sum_of_first_n_integers_no_loop.pl 
    Please input a positive integer: 100
    The sum of the first 100 integers is 5_050

    $ rperl -t LearningRPerl/Chapter2/exercise_7-sum_of_first_n_integers_no_loop.pl 
    Please input a positive integer: -100
    ERROR: -100 is not positive, dying
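
As a cross-check of the reasoning above, the following plain Perl 5 sketch compares the while loop from the previous exercise against the loop-free arithmetic, and adds one possible reading of the bitwise suggestion. The subroutine names sum_loop(), sum_gauss(), and sum_gauss_bitwise() are hypothetical, the RPerl type declarations are omitted so the code runs on any Perl interpreter, and the bitwise variant assumes a non-negative integer input. No claim is made here about which variant is actually faster.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# ASSUMPTION: plain Perl 5 sketch, not RPerl; all three subroutine names are hypothetical

# loop version, as in Chapter 2, Exercise 6
sub sum_loop {
    my ($n) = @_;
    my $sum = 0;
    my $i   = 1;
    while ($i <= $n) {
        $sum += $i;
        $i++;
    }
    return $sum;
}

# loop-free version, as in Chapter 2, Exercise 7
sub sum_gauss {
    my ($n) = @_;
    my $n_odd = 0;
    if ($n % 2) {
        $n_odd = $n;
        $n--;
    }
    return (($n + 1) * ($n / 2)) + $n_odd;
}

# same arithmetic, with & replacing % and >> replacing /
sub sum_gauss_bitwise {
    my ($n) = @_;
    my $n_odd = 0;
    if ($n & 1) {    # lowest bit is 1 only for odd integers
        $n_odd = $n;
        $n--;
    }
    return (($n + 1) * ($n >> 1)) + $n_odd;    # shifting an even integer right by 1 halves it
}

foreach my $n (0, 1, 2, 5, 100, 101) {
    print $n, ': ', sum_loop($n), ' ', sum_gauss($n), ' ', sum_gauss_bitwise($n), "\n";
}
```

All three columns should agree on every line; for example, an input of 101 is handled as 100 plus the saved odd value, giving (101 * 50) + 101.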


Chapter 3, Exercise 1

The goal of this exercise is to become familiar with the foreach loop control structure, arrays, and array operators; and to become further familiarized with the while loop.
The first line in the OPERATIONS section declares a new variable $input_strings of type string_arrayref, which is capable of storing 0 or more individual string values, and $input_strings is then initialized to contain the empty array [].
The line starting with while (my string $input_string = <STDIN>) denotes the beginning of an iterative (repeating) loop statement, which continues to accept and store user input until CTRL-D is pressed to indicate the EOF (end-of-file) condition, also known as EOT (end-of-transmission).
A new copy of the variable $input_string is created and assigned the value of collected user input by calling <STDIN> at the start of each loop iteration; as long as a line of input is received, the assignment is evaluated as a true condition and the loop repeats, until CTRL-D is received, <STDIN> returns the undefined value, and the condition becomes false.
Inside the body of the while loop is 1 line calling the push operator, which appends the current iteration's value of $input_string onto the list of strings contained in $input_strings.
The at-sign-curly-braces @{ } is the array dereference operator, which exists because some Perl operators, such as push, still require direct access to an array by value, instead of the RPerl method of indirectly accessing the array by reference.
The line starting with my string_arrayref $input_strings_reversed declares another array of string values $input_strings_reversed, and then assigns it the strings contained within the first array $input_strings in reversed order, as returned by calling the reverse operator.
As with the push operator, the reverse operator requires its argument to be dereferenced using at-sign-curly-braces @{ }; another dereferenced array value is returned by reverse, and an array reference is returned by enclosing reverse and its argument inside the square-brackets [ ] array reference operator.
Finally, the line starting with foreach my string $input_strings_reversed_element denotes the beginning of another loop statement, which iterates the value of $input_strings_reversed_element once for each string value contained in the $input_strings_reversed array; print is called inside the loop body to display the original input strings in reverse order.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 3, Exercise 1
    # Print user-supplied list of strings in reverse order

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ OPERATIONS ]]]
    my string_arrayref $input_strings = [];

    print 'Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:', "\n";

    while (my string $input_string = <STDIN>) {
        push @{$input_strings}, $input_string;
    }

    print "\n";
    print 'Strings in reverse order:', "\n";

    my string_arrayref $input_strings_reversed = [reverse @{$input_strings}];

    foreach my string $input_strings_reversed_element (@{$input_strings_reversed}) {
        print $input_strings_reversed_element;
    }

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter3/exercise_1-stdin_strings_reverse.pl 
    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:
    howdy
    doody
    buffalo
    bob
    clarabell
    clown

    Strings in reverse order:
    clown
    clarabell
    bob
    buffalo
    doody
    howdy
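
The dereference and reference steps described above can be tried without typing any input, as in the following plain Perl 5 sketch. It assumes an in-memory filehandle, here named $fake_stdin, standing in for <STDIN>; that name and the $typed_lines variable are hypothetical, and the RPerl type declarations are omitted so the code runs on any Perl interpreter.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# ASSUMPTION: plain Perl 5 sketch, not RPerl; $fake_stdin stands in for <STDIN>
my $typed_lines = "howdy\ndoody\nbuffalo\n";
open my $fake_stdin, '<', \$typed_lines or die 'Cannot open in-memory filehandle';

my $input_strings = [];    # reference to an empty array

# the loop ends when the readline operator returns the undefined value at end-of-file
while (my $input_string = <$fake_stdin>) {
    push @{$input_strings}, $input_string;    # push requires the dereferenced array
}
close $fake_stdin;

# reverse returns a list; the enclosing [ ] wraps that list in a new array reference
my $input_strings_reversed = [ reverse @{$input_strings} ];
my $input_strings_count    = scalar @{$input_strings};

print 'Count = ', $input_strings_count, "\n";    # Count = 3
print @{$input_strings_reversed};                # buffalo, doody, howdy on separate lines
```

The original array $input_strings is left untouched by reverse, so both orders remain available after the call.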


Chapter 3, Exercise 2

The goal of this exercise is to become familiar with utilizing array indices.
The first line in the OPERATIONS section declares a new variable $flintstones_and_rubbles of type string_arrayref, which is then initialized to contain a qw() (quote word) set of names including the string element 'fred' at array index 0, 'betty' at array index 1, and so on.
The second line in OPERATIONS creates an empty array of integers $input_indices, and the following while loop uses the push operator to fill $input_indices with integers entered by the user.
The foreach loop iterates through all the user input integers, placing each in the variable $input_index.
Inside the foreach loop, the variable $input_index is used to access the individual names inside the variable $flintstones_and_rubbles.
In Perl, all array indices start at 0 instead of 1, so we must first subtract 1 from $input_index before accessing the individual string elements of $flintstones_and_rubbles.
Thus, if a user inputs the integer 1, the array index will be 0, which is 'fred'; similarly, user input 5 will access array index 4 which is 'wilma'.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 3, Exercise 2
    # Print string array elements indexed by user-supplied integers

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ OPERATIONS ]]]
    my string_arrayref $flintstones_and_rubbles = [qw(fred betty barney dino wilma pebbles bamm-bamm)];
    my integer_arrayref $input_indices          = [];

    print 'Please input zero or more integers with values ranging from 1 to 7, separated by <ENTER>, ended by <CTRL-D>:', "\n";

    while ( my string $input_string = <STDIN> ) {
        push @{$input_indices}, string_to_integer($input_string);
    }

    print "\n";
    print 'Flintstones & Rubbles:', "\n";

    foreach my integer $input_index ( @{$input_indices} ) {
        print $flintstones_and_rubbles->[ ( $input_index - 1 ) ], "\n";
    }

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter3/exercise_2-stdin_array_indices.pl 
    Please input zero or more integers with values ranging from 1 to 7, separated by <ENTER>, ended by <CTRL-D>:
    2
    5
    3
    6
    5
    1
    2
    7
    4
    4

    Flintstones & Rubbles:
    betty
    wilma
    barney
    pebbles
    wilma
    fred
    betty
    bamm-bamm
    dino
    dino
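
The solution above trusts the user to stay within the range 1 to 7. In plain Perl 5, an input of 0 becomes array index -1, which refers to the last element ('bamm-bamm'), and an input of 8 reads past the end of the array and yields the undefined value. The following sketch shows one way such input might be checked; the helper name name_at_position() is hypothetical, the code is plain Perl 5 rather than RPerl, and it is not part of the original solution.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# ASSUMPTION: plain Perl 5 sketch, not RPerl; name_at_position() is a hypothetical helper
my $flintstones_and_rubbles = [qw(fred betty barney dino wilma pebbles bamm-bamm)];

# return the name at a 1-based position, or the undefined value if out of range
sub name_at_position {
    my ($names, $position) = @_;
    my $count = scalar @{$names};
    if (($position < 1) || ($position > $count)) { return; }
    return $names->[ $position - 1 ];    # subtract 1 because array indices start at 0
}

foreach my $position (1, 5, 0, 8) {
    my $name = name_at_position($flintstones_and_rubbles, $position);
    if (defined $name) {
        print $position, ' => ', $name, "\n";
    }
    else {
        print $position, ' => (out of range)', "\n";
    }
}
```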


Chapter 3, Exercise 3

The goal of this exercise is to become familiar with the sort and chomp operators.
In the CONSTANTS section, a constant SINGLE_LINE_OUTPUT is created and set to 0, which will cause the program's output to be displayed on multiple lines. If SINGLE_LINE_OUTPUT is instead set to 1, the program's output will be displayed on one line.
In the OPERATIONS section, a while loop is used to fill the array variable $input_strings with data supplied by the user.
The sort operator is then called to fill the $input_strings_sorted array with the same data in ASCII alphabetical order.
Last, a foreach loop is used to print the sorted output by storing and accessing one string at a time in the variable $input_strings_sorted_element.
An if conditional statement inside the foreach loop tests the constant SINGLE_LINE_OUTPUT; if the condition is true, then the chomp operator is called to remove the trailing newline character from each string element, and the .= append operator adds a normal blank space character in its place, thereby displaying all the elements on a single line of output.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 3, Exercise 3
    # Print user-supplied list of strings in ASCIIbetical order, optionally on single line of output

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitConstantPragma ProhibitMagicNumbers)  # USER DEFAULT 3: allow constants
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ CONSTANTS ]]]
    use constant SINGLE_LINE_OUTPUT => my boolean $TYPED_SINGLE_LINE_OUTPUT = 0;

    # [[[ OPERATIONS ]]]
    my string_arrayref $input_strings = [];

    print 'Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:', "\n";

    while ( my string $input_string = <STDIN> ) {
        push @{$input_strings}, $input_string;
    }

    print "\n";
    print 'Strings in ASCIIbetical order:', "\n";

    my string_arrayref $input_strings_sorted = [ sort @{$input_strings} ];

    foreach my string $input_strings_sorted_element ( @{$input_strings_sorted} ) {
        if ( SINGLE_LINE_OUTPUT() ) {

            # strip trailing newline, if present
            chomp $input_strings_sorted_element;
            $input_strings_sorted_element .= q{ };
        }

        print $input_strings_sorted_element;
    }

    print "\n";

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter3/exercise_3-stdin_strings_sort.pl 
    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:
    howdy
    doody
    buffalo
    bob
    clarabell
    clown

    Strings in ASCIIbetical order:
    bob
    buffalo
    clarabell
    clown
    doody
    howdy
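
The example run above uses only lowercase input, so it does not show how ASCIIbetical order differs from dictionary order. The following plain Perl 5 sketch uses hard-coded mixed-case strings in place of user input and always takes the single-line path; the variable $single_line is hypothetical, and the RPerl type declarations and SINGLE_LINE_OUTPUT constant are omitted.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# ASSUMPTION: plain Perl 5 sketch, not RPerl; hard-coded strings stand in for user input
my $input_strings        = [ "howdy\n", "Doody\n", "buffalo\n", "Bob\n" ];
my $input_strings_sorted = [ sort @{$input_strings} ];

my $single_line = q{};
foreach my $input_strings_sorted_element (@{$input_strings_sorted}) {
    chomp $input_strings_sorted_element;                      # strip trailing newline, if present
    $single_line .= $input_strings_sorted_element . q{ };     # append the element plus one blank space
}

# uppercase letters precede lowercase letters in ASCII, so 'Doody' sorts before 'buffalo'
print $single_line, "\n";    # Bob Doody buffalo howdy
```

Note that the output line ends with one trailing blank space, because a space is appended after every element, including the last.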


Chapter 4, Exercise 1

The goal of this exercise is to become familiar with user-defined subroutines.
In the SUBROUTINES section, one subroutine total() is defined, which accepts as input one argument variable $input_numbers of type number_arrayref, and which also has a number return value stored in the variable $retval.
Inside the total() subroutine, the return value is initialized to 0 in $retval, then a foreach loop iteratively adds the elements of the array $input_numbers to $retval, after which the value of $retval is returned to the original caller of total().
By itself, a subroutine such as total() does not actually do anything; every subroutine must first be called either by some other subroutine or in the OPERATIONS section.
In OPERATIONS, a 5-element array is created and stored in the variable $fred, which is then passed as input to the subroutine total(), and the return value is displayed using the variable $fred_total and the print operator.
Next, a while loop and <STDIN> are used to collect user input strings, which are then converted to numeric data values using the string_to_number() subroutine, and stored in the array $input_numbers using the push operator.
Finally, the subroutine total() is called a second time, now with the variable $input_numbers passed as the input argument, and the return value is displayed using the variable $user_total.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 4, Exercise 1
    # Subroutine & driver to calculate the totals of arrays of stringified numbers, both hard-coded and user-supplied

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ SUBROUTINES ]]]

    our number $total = sub {
        (my number_arrayref $input_numbers) = @ARG;
        my number $retval = 0;
        foreach my number $input_number (@{$input_numbers}) {
            $retval += $input_number;
        }
        return $retval;
    };

    # [[[ OPERATIONS ]]]

    my number_arrayref $fred = [1, 3, 5, 7, 9];
    my number $fred_total = total($fred);
    print 'The total of $fred is ', to_string($fred_total), "\n";

    print 'Please input zero or more numbers, separated by <ENTER>, ended by <CTRL-D>:', "\n";

    my number_arrayref $input_numbers = [];
    while (my string $input_string = <STDIN>) {
        push @{$input_numbers}, string_to_number($input_string);
    }

    my number $user_total = total($input_numbers);
    print 'The total of those numbers is ', to_string($user_total), "\n";

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter4/exercise_1-subroutine_total.pl 
    The total of $fred is 25
    Please input zero or more numbers, separated by <ENTER>, ended by <CTRL-D>:
    21.12
    23.42
    1701.877
    -123.456
    The total of those numbers is 1_622.961


Chapter 4, Exercise 2

The goal of this exercise is to become familiar with the .. range operator.
In the SUBROUTINES section, the same subroutine total() is defined as in the previous exercise.
In the OPERATIONS section, an array of numbers is created and stored in the variable $one_to_one_thousand.
Upon execution, there will be 1,000 number elements in the array, which are automatically created by the .. range operator.
The actual elements stored in the array variable $one_to_one_thousand start with [1, 2, 3, 4 and end with 997, 998, 999, 1_000].
Finally, the subroutine total() is called with $one_to_one_thousand passed as the input argument, and the return value is displayed using the variable $one_to_one_thousand_total.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 4, Exercise 2
    # Subroutine & driver to calculate the total of 1 to 1,000

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator

    # [[[ SUBROUTINES ]]]

    our number $total = sub {
        ( my number_arrayref $input_numbers ) = @ARG;
        my number $retval = 0;
        foreach my number $input_number ( @{$input_numbers} ) {
            $retval += $input_number;
        }
        return $retval;
    };

    # [[[ OPERATIONS ]]]

    my number_arrayref $one_to_one_thousand = [ 1 .. 1_000 ];
    my number $one_to_one_thousand_total    = total($one_to_one_thousand);
    print 'The total of 1 to 1_000 is ', to_string($one_to_one_thousand_total), q{.}, "\n";

Example execution and output:

    $ rperl -t LearningRPerl/Chapter4/exercise_2-subroutine_total_1000.pl 
    The total of 1 to 1_000 is 500_500.
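
The behavior of the .. range operator can be inspected on a smaller scale, as in the following plain Perl 5 sketch. The variable names are hypothetical and the RPerl type declarations are omitted so the code runs on any Perl interpreter.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# ASSUMPTION: plain Perl 5 sketch, not RPerl; variable names are hypothetical
my $one_to_ten       = [ 1 .. 10 ];
my $one_to_ten_count = scalar @{$one_to_ten};

print 'Count = ', $one_to_ten_count, "\n";                                   # Count = 10
print 'First = ', $one_to_ten->[0], ', last = ', $one_to_ten->[-1], "\n";    # First = 1, last = 10

# a range whose left side is greater than its right side produces no elements
my $empty       = [ 5 .. 1 ];
my $empty_count = scalar @{$empty};

print 'Count = ', $empty_count, "\n";                                        # Count = 0
```

Both end points are included in the generated list, which is why 1 .. 1_000 yields exactly 1,000 elements.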


Chapter 4, Exercise 3

The goal of this exercise is to become familiar with calling user-defined subroutines from one another.
In the SUBROUTINES section, 3 subroutines are defined: total() (same as previous exercises), average(), and above_average().
Inside the subroutine average() is a call to the subroutine total(); there is also a call to the scalar operator, which returns the count of elements inside the array $input_numbers. When the return value of total() is divided by that of scalar, the result is the arithmetic mean (average) of all elements of $input_numbers.
Inside above_average() is a call to the subroutine average(), with the return value stored in the variable $average. An empty array is created in the variable $retval, then a foreach loop iterates over all elements in $input_numbers and an if conditional statement makes a copy of all elements which are greater-than $average. All above-average elements are returned as an array in $retval.
In the OPERATIONS section, 2 arrays are created in the $fred and $barney variables, which are then passed as input arguments to 2 calls to the subroutine above_average(), and the results are displayed.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 4, Exercise 3
    # Subroutines & driver to calculate the above-average elements of hard-coded arrays

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils

    # [[[ SUBROUTINES ]]]

    our number $total = sub {
        ( my number_arrayref $input_numbers ) = @ARG;
        my number $retval = 0;
        foreach my number $input_number ( @{$input_numbers} ) {
            $retval += $input_number;
        }
        return $retval;
    };  

    our number $average = sub {
        ( my number_arrayref $input_numbers ) = @ARG;
        return (total($input_numbers) / (scalar @{$input_numbers}));
    };

    our number_arrayref $above_average = sub {
        ( my number_arrayref $input_numbers ) = @ARG;
        my number $average = average($input_numbers);
        my number_arrayref $retval = [];
        foreach my number $input_number (@{$input_numbers}) {
            if ($input_number > $average) {
                push @{$retval}, $input_number;
            }
        }
        return $retval;
    };

    # [[[ OPERATIONS ]]]

    my number_arrayref $fred = [1 .. 10];
    my number_arrayref $fred_above_average = above_average($fred);
    print '$fred is ', number_arrayref_to_string($fred), "\n";
    print 'The above-average elements of $fred are ', number_arrayref_to_string($fred_above_average), "\n";
    print '(Should be [6, 7, 8, 9, 10])', "\n\n";

    my number_arrayref $barney = [100, 1 .. 10];
    my number_arrayref $barney_above_average = above_average($barney);
    print '$barney is ', number_arrayref_to_string($barney), "\n";
    print 'The above-average elements of $barney are ', number_arrayref_to_string($barney_above_average), "\n";
    print '(Should be just [100])', "\n";

Example execution and output:

    $ rperl -t LearningRPerl/Chapter4/exercise_3-subroutine_above_average.pl 
    $fred is [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    The above-average elements of $fred are [6, 7, 8, 9, 10]
    (Should be [6, 7, 8, 9, 10])

    $barney is [100, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    The above-average elements of $barney are [100]
    (Should be just [100])
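
The average() subroutine above divides by the element count, so passing an empty array would attempt a division by zero. The following plain Perl 5 sketch shows one way the count might be checked first; the subroutine name average_or_undef() is hypothetical, the code is not RPerl, and it is not part of the original solution.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# ASSUMPTION: plain Perl 5 sketch, not RPerl; average_or_undef() is a hypothetical variant

sub total {
    my ($input_numbers) = @_;
    my $retval = 0;
    foreach my $input_number (@{$input_numbers}) {
        $retval += $input_number;
    }
    return $retval;
}

# return the arithmetic mean, or the undefined value for an empty array
sub average_or_undef {
    my ($input_numbers) = @_;
    my $count = scalar @{$input_numbers};    # number of elements in the array
    if ($count == 0) { return; }
    return total($input_numbers) / $count;
}

my $fred_average = average_or_undef([ 1 .. 10 ]);
print 'Average of 1 to 10 is ', $fred_average, "\n";    # Average of 1 to 10 is 5.5

my $empty_average = average_or_undef([]);
my $empty_shown   = defined $empty_average ? $empty_average : 'undefined';
print 'Average of empty array is ', $empty_shown, "\n";    # Average of empty array is undefined
```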


Chapter 4, Exercise 4

The goal of this exercise is to become familiar with program state.
As a program executes, there may be one or more variables in the OPERATIONS section which store information that is important to the overall program; these variables are collectively known as the "state" of the program.
In the OPERATIONS section, a string variable $previous_name is created and set to the empty string q{}; this 1 variable is the state of the program.
Each of the 4 following lines in the OPERATIONS section utilizes the program state by both reading from and writing to the variable $previous_name; the calls to the subroutine greet() read from $previous_name as the 2nd argument, and the return value of greet() is then written to $previous_name by the = assignment operator.
In the SUBROUTINES section, 1 subroutine greet() is defined which accepts 2 string arguments, the variables $name and $previous_name.
Inside greet(), a personalized greeting is displayed for the virtual person represented by the variable $name, after which an if() else conditional statement checks if there is any data inside $previous_name and further displays a customized comment about the state of the program. The state variable $previous_name is used to represent the very simple state of virtual people who have already been greeted.
Finally, the string variable $name is returned from greet() and passed to the = assignment operators in the OPERATIONS section, thereby updating the program state variable $previous_name to contain the value of $name.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 4, Exercise 4
    # Subroutine & driver to greet users

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator

    # [[[ SUBROUTINES ]]]

    our string $greet = sub {
        ( my string $name, my string $previous_name ) = @ARG;
        print 'Hi ', $name, '!  ';
        if ($previous_name eq q{}) {
            print 'You are the first one here!', "\n";
        }
        else {
            print $previous_name, ' is also here!', "\n";
        }
        return $name;
    };

    # [[[ OPERATIONS ]]]

    my string $previous_name = q{};
    $previous_name = greet('Fred', $previous_name);
    $previous_name = greet('Barney', $previous_name);
    $previous_name = greet('Wilma', $previous_name);
    $previous_name = greet('Betty', $previous_name);

Example execution and output:

    $ rperl -t LearningRPerl/Chapter4/exercise_4-subroutine_greet.pl 
    Hi Fred!  You are the first one here!
    Hi Barney!  Fred is also here!
    Hi Wilma!  Barney is also here!
    Hi Betty!  Wilma is also here!


Chapter 4, Exercise 5

The goal of this exercise is to become further familiarized with program state.
In the OPERATIONS section, the state variable $previous_names is an array of strings representing all previous virtual people who have been greeted.
In the SUBROUTINES section, the subroutine greet() prints the names of previously-greeted virtual people on one line, separated by spaces, by using the join operator.
Finally, the current value of $name is appended as the last element of the array $previous_names by the push operator, and $previous_names is then returned by greet() to update the program state.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 4, Exercise 5
    # Subroutine & driver to greet multiple users

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator

    # [[[ SUBROUTINES ]]]

    our string_arrayref $greet = sub {
        ( my string $name, my string_arrayref $previous_names ) = @ARG;
        print 'Hi ', $name, '!  ';
        if ((scalar @{$previous_names}) == 0) {
            print 'You are the first one here!', "\n";
        }
        else {
            print q{I've seen: }, (join q{ }, @{$previous_names}), "\n";
        }
        push @{$previous_names}, $name;
        return $previous_names;
    };

    # [[[ OPERATIONS ]]]

    my string_arrayref $previous_names = [];
    $previous_names = greet('Fred', $previous_names);
    $previous_names = greet('Barney', $previous_names);
    $previous_names = greet('Wilma', $previous_names);
    $previous_names = greet('Betty', $previous_names);

Example execution and output:

    $ rperl -t LearningRPerl/Chapter4/exercise_5-subroutine_greet_multiple.pl 
    Hi Fred!  You are the first one here!
    Hi Barney!  I've seen: Fred
    Hi Wilma!  I've seen: Fred Barney
    Hi Betty!  I've seen: Fred Barney Wilma


Chapter 5, Exercise 1

The goal of this exercise is to become familiar with file test operators, as well as opening, closing, and reading from a file.
In the CRITICS section, the ProhibitPostfixControls critic is disabled due to a bug in Perl::Critic and/or PPI which causes a false error.
In the SUBROUTINES section, 1 subroutine tac() is defined which accepts as input an array of file names received via the operating system's command-line arguments, and which has a void return value, meaning there is no return value for this subroutine.
Inside tac(), the reverse operator is called to reverse the order of the command-line arguments; for example, if the 3 file names fred barney betty are given as command-line arguments, then reverse causes the 3 strings betty barney fred to be stored in the array variable $command_line_arguments.
The outer foreach loop iterates through each $file_name in the now-reversed $command_line_arguments.
Next, 4 file test operators are called, to ensure each input file exists via -e, is readable via -r, is a regular (not special device) file via -f, and consists of plain text via -T.
The operator open is called with the < file input (read-only) argument, which opens each $file_name for reading via the $FILE filehandle variable.
The inner while loop reads in all lines from the current $FILE filehandle using the <$FILE> syntax (similar to <STDIN>) and stores the input file's lines in the string array variable $file_lines; then, the reverse operator is called again to reverse the order of the newly-read file lines.
Finally, the inner foreach loop displays the now-reversed $file_lines, and the close operator is called to close the $FILE filehandle so no more file access will occur for the current $file_name.
In the OPERATIONS section, the tac() subroutine is called with its only argument being [@ARGV], a reference to a copy of the special @ARGV array, which is Perl's way of accessing the command-line arguments.
Before executing this program, the non-Perl `printf` program must be called to populate some test data into the 3 input files fred, barney, and betty; and after execution the `rm` program is called to delete the 3 input files.
To begin execution of this program via the `rperl` command, the program name and input file names must be enclosed in either 'single-quotes' or "double-quotes"; this tells RPerl the input file names are command-line arguments to be passed to the 1 specified program, instead of specifying additional RPerl programs.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 5, Exercise 1
    # Accept one or more input files, and print their contents line-by-line in reverse order

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitPostfixControls)  # SYSTEM SPECIAL 6: PERL CRITIC FILED ISSUE #639, not postfix foreach or if

    # [[[ SUBROUTINES ]]]

    our void $tac = sub {
        ( my string_arrayref $command_line_arguments ) = @ARG;
        $command_line_arguments = [ reverse @{$command_line_arguments} ];
        foreach my string $file_name ( @{$command_line_arguments} ) {
            if ( not( -e $file_name ) ) {
                croak 'ERROR: File ', $file_name, ' does not exist, croaking';
            }
            if ( not( -r $file_name ) ) {
                croak 'ERROR: File ', $file_name, ' is not readable, croaking';
            }
            if ( not( -f $file_name ) ) {
                croak 'ERROR: File ', $file_name, ' is not a regular file, croaking';
            }
            if ( not( -T $file_name ) ) {
                croak 'ERROR: File ', $file_name, ' is (probably) not text, croaking';
            }

            my integer $open_success = open my filehandleref $FILE, '<', $file_name;
            if ( not $open_success ) {
                croak 'ERROR: Failed to open file ', $file_name, ' for reading, croaking';
            }

            my string_arrayref $file_lines = [];

            while ( my string $file_line = <$FILE> ) {
                push @{$file_lines}, $file_line;
            }

            $file_lines = [ reverse @{$file_lines} ];

            foreach my string $file_line ( @{$file_lines} ) {
                print $file_line;
            }

            if ( not close $FILE ) {
                croak 'ERROR: Failed to close file ', $file_name, ' after reading, croaking';
            }
        }
    };

    # [[[ OPERATIONS ]]]

    tac( [@ARGV] );

Example execution, input, and output:

    $ printf "fred0\nfred1\nfred2\nfred3\nfred4\n" > fred
    $ printf "barney0\nbarney1\nbarney2\n" > barney
    $ printf "betty0\nbetty1\nbetty2\nbetty3\n" > betty
    $ rperl -t 'LearningRPerl/Chapter5/exercise_1-tac.pl fred barney betty'
    betty3
    betty2
    betty1
    betty0
    barney2
    barney1
    barney0
    fred4
    fred3
    fred2
    fred1
    fred0
    $ rm fred barney betty


Chapter 5, Exercise 2

The goal of this exercise is to become familiar with the string length operator and basic text formatting.
In the SUBROUTINES section, 1 subroutine right_justify_20() is defined, which accepts no input arguments and returns no values.
Inside right_justify_20(), an empty array of strings is initialized in the $input_strings variable, which is then populated with strings in a while loop collecting user input from <STDIN>.
The x string repeat operator is called to create a 60-character-wide horizontal ruler, which is then displayed by the print operator.
Finally, a foreach loop iterates through each $input_string and calls the length operator, thereby determining the correct number of spaces to prepend in order to achieve right justification alignment to the 20th character column. The - subtraction operator is passed a hard-coded column width of 20, and the chomp operator is called to remove any trailing newline characters which may be appended when the user presses the ENTER key after each input word.
In the OPERATIONS section, the only operation is a call to the right_justify_20() subroutine.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 5, Exercise 2
    # Accept one or more input lines, and print them in a right-justified 20-column format

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ SUBROUTINES ]]]

    our void $right_justify_20 = sub {
        my string_arrayref $input_strings = [];
        print 'Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:', "\n";
        while ( my string $input_string = <STDIN> ) {
            push @{$input_strings}, $input_string;
        }

        print "\n";
        print '1234567890' x 6;
        print "\n";

        foreach my string $input_string ( @{$input_strings} ) {
            chomp $input_string;
            print q{ } x ( 20 - ( length $input_string ) );
            print $input_string, "\n";
        }
    };

    # [[[ OPERATIONS ]]]

    right_justify_20();

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter5/exercise_2-right_justify.pl
    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:
    howdy
    doody
    buffalo
    bob
    clarabell
    clown

    123456789012345678901234567890123456789012345678901234567890
                   howdy
                   doody
                 buffalo
                     bob
               clarabell
                   clown


Chapter 5, Exercise 3

The goal of this exercise is to become further familiarized with basic text formatting.
In the SUBROUTINES section, 1 subroutine right_justify_variable() is defined, which accepts no input arguments and returns no values.
Inside right_justify_variable(), the user is prompted to input a custom right justify width, stored in the integer variable $column_width.
The integer variable $ruler_width_tens is used to determine the number of characters displayed for the ruler, by tens; the default value for $ruler_width_tens is 6, which means a ruler width of 60 characters. If the user-supplied $column_width is greater-than 60, we scale it by 1/10 and add 1, thereby creating a new value for $ruler_width_tens which will always display a ruler wider than $column_width.
When the ruler is displayed, $ruler_width_tens is passed to the x string repeat operator instead of a hard-coded value of 6; likewise, when each $input_string is right justified, the - subtraction operator is passed $column_width instead of a hard-coded value of 20.
In the OPERATIONS section, the only operation is a call to the right_justify_variable() subroutine.
This exercise is otherwise identical to the previous exercise.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 5, Exercise 3
    # Accept column width followed by one or more input lines, and print lines in a right-justified format

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ SUBROUTINES ]]]

    our void $right_justify_variable = sub {
        my string_arrayref $input_strings = [];
        print 'Please input integer column width, then press <ENTER>:', "\n";
        my string $column_width_string = <STDIN>;
        my integer $column_width       = string_to_integer($column_width_string);

        print 'Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:', "\n";
        while ( my string $input_string = <STDIN> ) {
            push @{$input_strings}, $input_string;
        }

        my integer $ruler_width_tens = 6;    # default to ruler line width 60
        if ( $column_width > 60 ) {
            $ruler_width_tens = number_to_integer( $column_width / 10 ) + 1;
        }

        print "\n";
        print '1234567890' x $ruler_width_tens;
        print "\n";

        foreach my string $input_string ( @{$input_strings} ) {
            chomp $input_string;
            print q{ } x ( $column_width - ( length $input_string ) );
            print $input_string, "\n";
        }
    };

    # [[[ OPERATIONS ]]]

    right_justify_variable();

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter5/exercise_3-right_justify_variable.pl 
    Please input integer column width, then press <ENTER>:
    67
    Please input zero or more strings, separated by <ENTER>, ended by <CTRL-D>:
    howdy
    doody
    buffalo
    bob
    clarabell
    clown

    1234567890123456789012345678901234567890123456789012345678901234567890
                                                                  howdy
                                                                  doody
                                                                buffalo
                                                                    bob
                                                              clarabell
                                                                  clown
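
To see the ruler arithmetic in isolation, here is a minimal sketch in plain Perl 5 (not type-annotated RPerl). The subroutine name ruler_width_tens() is hypothetical, and int() is assumed to truncate the same way number_to_integer() does in the solution above; the example run, where a width of 67 produces a 70-character ruler, is consistent with that assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, plain Perl 5: int() stands in for RPerl's number_to_integer()
sub ruler_width_tens {
    my ($column_width) = @_;
    my $tens = 6;    # default to ruler line width 60
    if ( $column_width > 60 ) {
        # scale by 1/10, drop the fraction, then add 1 so the ruler is wider than the column
        $tens = int( $column_width / 10 ) + 1;
    }
    return $tens;
}

foreach my $column_width ( 20, 60, 67, 70 ) {
    my $tens = ruler_width_tens($column_width);
    print 'column width ', $column_width, ' gives ruler width ', $tens * 10, "\n";
}
```

A width of 67 yields 7 tens (a 70-character ruler), while any width of 60 or less keeps the default of 6 tens.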


Chapter 6, Exercise 1

The goal of this exercise is to become familiar with hash data structures and hash operators.
In the SUBROUTINES section, 1 subroutine given_to_family_name() is defined, which accepts no input arguments and returns no values.
Inside given_to_family_name(), a hash data structure is created in the variable $names. Another term for hash is associative array, because each hash is comprised of key-value pairs, and we say there is a value "associated" with every key. Hash keys are bare words and must be unique, while hash values may be any specified data type and need not be unique. The 3 keys in $names are fred, barney, and wilma; the value of the key fred is 'flintstone'.
A first name is collected from the user by <STDIN> and stored in $given_name, then the chomp operator is called to remove the trailing newline recorded when the user presses the ENTER key.
An if conditional statement calls the exists and defined operators to ensure the user has entered a valid hash key, and the croak operator halts the program with an error message if there is no such key in the $names hash.
Finally, the thin-arrow syntax $names->{$given_name} is used to retrieve the hash value, and print is called to display the family name outputs.
In the OPERATIONS section, the only operation is a call to the given_to_family_name() subroutine.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 6, Exercise 1
    # Accept one input given (first) name, and print the corresponding family (last) name

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ SUBROUTINES ]]]

    our void $given_to_family_name = sub {
        my string_hashref $names = {
            fred => 'flintstone',
            barney => 'rubble',
            wilma => 'flintstone'
        };

        print 'Please input a given (first) name in all lowercase, then press <ENTER>:', "\n";
        my string $given_name = <STDIN>;
        chomp $given_name;

        if ((not exists $names->{$given_name}) or (not defined $names->{$given_name})) {
            croak 'ERROR: No family (last) name found for given (first) name ', $given_name, ', croaking', "\n";
        }

        print 'The family (last) name of ', $given_name, ' is ', $names->{$given_name}, q{.}, "\n";
    };

    # [[[ OPERATIONS ]]]

    given_to_family_name();

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter6/exercise_1-hash_family_names.pl 
    Please input a given (first) name in all lowercase, then press <ENTER>:
    fred
    The family (last) name of fred is flintstone.

    $ rperl -t LearningRPerl/Chapter6/exercise_1-hash_family_names.pl 
    Please input a given (first) name in all lowercase, then press <ENTER>:
    howdy
    ERROR: No family (last) name found for given (first) name howdy, croaking
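
The solution checks both exists and defined because they answer different questions: exists asks whether the key is present at all, while defined asks whether the stored value is something other than undef. The following is a minimal sketch in plain Perl 5 (not type-annotated RPerl); the lookup_status() subroutine and the dino key with an undefined value are hypothetical additions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical data: 'dino' is a key which is present, but whose value is undefined
my $names = {
    fred   => 'flintstone',
    barney => 'rubble',
    dino   => undef,
};

# Hypothetical helper: classify a key using the same two tests as the if conditional statement
sub lookup_status {
    my ( $hashref, $key ) = @_;
    if ( not exists $hashref->{$key} )  { return 'no such key'; }
    if ( not defined $hashref->{$key} ) { return 'key exists, value undefined'; }
    return 'found';
}

foreach my $given_name (qw(fred dino howdy)) {
    print $given_name, ': ', lookup_status( $names, $given_name ), "\n";
}
```

Only fred passes both tests; dino fails the defined test, and howdy fails the exists test.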


Chapter 6, Exercise 2

The goal of this exercise is to become further familiarized with hash data structures and hash operators.
In the SUBROUTINES section, 1 subroutine unique_word_count() is defined, which accepts no input arguments and returns no values.
In unique_word_count(), an empty hash of integer values is created in the variable $word_counts; a while loop then collects user input strings, inside of which an if() else conditional statement updates the $word_counts hash. If a word is seen for the first time, then the corresponding $word_counts hash value will be set to 1, otherwise the count value will be incremented by 1.
Finally, a foreach loop iterates through the alphabetically-sorted keys of the $word_counts hash by calling the sort and keys operators, then the thin-arrow hash value retrieval syntax $word_counts->{$unique_word} is called, and the print operator displays the word count outputs.
In the OPERATIONS section, the only operation is a call to the unique_word_count() subroutine.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 6, Exercise 2
    # Accept a list of words, and print the count of each unique word

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt

    # [[[ SUBROUTINES ]]]

    our void $unique_word_count = sub {
        my integer_hashref $word_counts = {};

        print 'Please input zero or more words, separated by <ENTER>, ended by <CTRL-D>:', "\n";
        while (my string $input_word = <STDIN>) {
            chomp $input_word;
            if (not exists $word_counts->{$input_word}) {
                $word_counts->{$input_word} = 1;
            }
            else {
                $word_counts->{$input_word} += 1;
            }
        }

        print "\n", 'Unique word count:', "\n";

        foreach my string $unique_word (sort keys %{$word_counts}) {
            print $unique_word, ' appeared ', to_string($word_counts->{$unique_word}), ' time(s)', "\n";
        }
    };

    # [[[ OPERATIONS ]]]

    unique_word_count();

Example execution, input, and output:

    $ rperl -t LearningRPerl/Chapter6/exercise_2-hash_unique_word_count.pl 
    Please input zero or more words, separated by <ENTER>, ended by <CTRL-D>:
    george
    jane
    judy
    elroy
    astro
    rosie
    george
    judy
    astro
    fred
    wilma
    barney
    betty
    pebbles
    bamm-bamm
    dino
    fred
    barney
    dino
    george
    fred
    barney
    astro
    dino
    jane
    judy
    betty
    wilma

    Unique word count:
    astro appeared 3 time(s)
    bamm-bamm appeared 1 time(s)
    barney appeared 3 time(s)
    betty appeared 2 time(s)
    dino appeared 3 time(s)
    elroy appeared 1 time(s)
    fred appeared 3 time(s)
    george appeared 3 time(s)
    jane appeared 2 time(s)
    judy appeared 3 time(s)
    pebbles appeared 1 time(s)
    rosie appeared 1 time(s)
    wilma appeared 2 time(s)


Chapter 6, Exercise 3

The goal of this exercise is to become further familiarized with hash data structures and basic text formatting.
In the SUBROUTINES section, 1 subroutine sort_env_vars() is defined, which accepts no input arguments and returns no values.
Inside sort_env_vars(), a hash of strings is created in the variable $env_vars, and it is initialized to contain the values of the special %ENV system hash, which stores the current user's environmental variables.
Next, 2 integer variables $env_var_length and $left_column_width are created, and $left_column_width is initialized to the value 0. A foreach loop iterates through all environmental variables, measuring the string length of each $env_var by the length operator, and using an if conditional statement to test if the current $env_var_length is greater-than the existing $left_column_width. If $env_var_length is large enough, then $left_column_width is updated, thereby resulting in the value of $left_column_width being equal to the longest $env_var_length.
After the foreach loop, $left_column_width is incremented by an additional 2 character widths, allowing for 2 or more spaces between the hash keys and their respective values when displayed.
Finally, there is another foreach loop below the first, again iterating through all $env_vars, and printing the keys at the beginning of each output line. The - subtraction operator is called to find the difference between the current key length and the pre-calculated $left_column_width, then the x string repeat operator is called to create the corresponding number of blank spaces as padding between the key and its value. The thin-arrow hash value retrieval syntax $env_vars->{$env_var} is called, and the print operator displays the hash value outputs.
In the OPERATIONS section, the only operation is a call to the sort_env_vars() subroutine.

    #!/usr/bin/env perl

    # Learning RPerl, Chapter 6, Exercise 3
    # Print sorted environmental variables

    # [[[ HEADER ]]]
    use RPerl;
    use strict;
    use warnings;
    our $VERSION = 0.001_000;

    # [[[ CRITICS ]]]
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator

    # [[[ SUBROUTINES ]]]

    our void $sort_env_vars = sub {
        my string_hashref $env_vars = {%ENV};

        my integer $env_var_length;
        my integer $left_column_width = 0;
        foreach my string $env_var ( sort keys %{$env_vars} ) {
            $env_var_length = length $env_var;
            if ( $env_var_length > $left_column_width ) {
                $left_column_width = $env_var_length;
            }
        }

        $left_column_width += 2;

        print 'Environmental variables:', "\n";

        foreach my string $env_var ( sort keys %{$env_vars} ) {
            print $env_var;
            print q{ } x ( $left_column_width - ( length $env_var ) );
            print $env_vars->{$env_var}, "\n";
        }
    };

    # [[[ OPERATIONS ]]]

    sort_env_vars();

Example execution and output:

    $ rperl -t LearningRPerl/Chapter6/exercise_3-hash_sort_env_vars.pl 
    Environmental variables:
    COLORTERM                 xfce4-terminal
    DESKTOP_SESSION           xubuntu
    DISPLAY                   :0.0
    GDMSESSION                xubuntu
    GDM_LANG                  en_US
    HOME                      /home/jlpicard
    LANG                      en_US.UTF-8
    LANGUAGE                  en_US
    LOGNAME                   jlpicard
    PATH                      .:script:bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
    PERL5LIB                  blib/lib:lib:/home/jlpicard/perl5/lib/perl5
    PERL_LOCAL_LIB_ROOT       /home/jlpicard/perl5
    PERL_MB_OPT               --install_base "/home/jlpicard/perl5"
    PERL_MM_OPT               INSTALL_BASE=/home/jlpicard/perl5
    PWD                       /home/jlpicard
    SHELL                     /bin/bash
    TERM                      xterm
    USER                      jlpicard
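
Because the output above depends on whichever environmental variables happen to be set, the column-width logic can be easier to follow with a small fixed hash. The following is a minimal sketch in plain Perl 5 (not type-annotated RPerl); the format_two_columns() subroutine and the 3 sample entries are hypothetical stand-ins for sort_env_vars() and %ENV.

```perl
use strict;
use warnings;

# Hypothetical stand-in for %ENV, so the result is the same on every system
my $env_vars = {
    HOME     => '/home/fred',
    PERL5LIB => 'lib',
    SHELL    => '/bin/bash',
};

# Hypothetical helper: return the formatted lines instead of printing them directly
sub format_two_columns {
    my ($hashref) = @_;

    # 1st pass: find the longest key, then add 2 characters of padding
    my $left_column_width = 0;
    foreach my $key ( keys %{$hashref} ) {
        if ( ( length $key ) > $left_column_width ) {
            $left_column_width = length $key;
        }
    }
    $left_column_width += 2;

    # 2nd pass: pad each key with spaces up to the left column width
    my @lines;
    foreach my $key ( sort keys %{$hashref} ) {
        my $padding = q{ } x ( $left_column_width - ( length $key ) );
        push @lines, $key . $padding . $hashref->{$key};
    }
    return \@lines;
}

foreach my $line ( @{ format_two_columns($env_vars) } ) {
    print $line, "\n";
}
```

With these 3 keys, the longest key PERL5LIB has 8 characters, so the left column is 10 characters wide and every value starts in the 11th character column.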


APPENDIX B: RPERL COMMAND-LINE ARGUMENTS

The `rperl` "command-line interface" (CLI) is a software program which serves as the primary front-end user interface for RPerl. When called for execution, the `rperl` command must be provided with at least one input file name, which tells RPerl which file(s) to compile. In addition, `rperl` may also be provided with one or more optional "command-line arguments", which tell RPerl how to behave and exactly how to compile the input file(s). Command-line arguments are also commonly referred to as "options", or in this specific context just "arguments".

Command-line arguments which are listed as one of the "modes" will modify the way RPerl compiles the input source code file(s). All RPerl modes are written using the longhand format:

--mode foo=BAR

Arguments which are listed as one of the "flags" are written using the shorthand --foo or even shorter -f formats. Most flags are shortcuts to common modes.

Flags and other non-mode arguments which are not shortcuts to common modes will either modify some non-compiling behavior of RPerl (such as enabling additional information output), or cause RPerl to do something other than fully execute and compile input source code files (such as displaying version information then exiting).

Below is a comprehensive list of all RPerl command-line arguments, as reported by the `rperl -?` command.

Additional explanations for each argument are provided with emphasis.

B.1: Help

--help ...OR... -h ...OR... -?
    Print this (relatively) brief help message for command-line usage.
    For additional explanations, run the command `perldoc RPerl::Learning` and see Appendix B.

The help option simply displays the content of this appendix, except for the additional explanations such as this one.

Yes, this is the "Appendix B" you're looking for.

B.2: Version

--version ...OR... -v
--vversion ...OR... -vv
    Print version number and copyright information.
    Repeat as 'vv' for more technical information, similar to `perl -V` configuration summary argument.
    Lowercase 'v' not to be confused with uppercase 'V' in 'Verbose' argument.

The single-v version option displays a brief message similar to the output of the `perl -v` command.

The double-v version option displays a collection of technical information similar to the output of the `perl -V` command, including subcompile flags for the C++ compiler and system paths. This option may be useful to RPerl system developers when debugging subcompile issues or other problems.

B.3: Input Files

--infile=MyFile.pm ...OR... -i=MyFile.pm
    Specify input file, may be repeated for multiple input files.
    Argument prefix '--infile' may be entirely omitted.
    Argument prefix MUST be omitted to specify wildcard for multiple input files.

The input file option tells RPerl which file(s) to compile, and is the only command-line argument which is always required every time the `rperl` command is fully executed (not just `rperl -?`, `rperl -v`, etc). Even if the argument prefix '--infile' is omitted for convenience, one or more input file names must still be specified to avoid an RPerl error message.

All input files must end in '.pl' for RPerl programs, or '.pm' for RPerl modules.

B.4: Output Files

--outfile=MyCompiledModule ...OR... -o=MyCompiledModule
--outfile=my_compiled_program ...OR... -o=my_compiled_program
    Specify output file prefix, may be repeated for multiple output files.
    RPerl *.pm input file with PERL ops will create MyCompiledModule.pmc output file.
    RPerl *.pl input file with PERL ops will create my_compiled_program (or my_compiled_program.exe on Windows) output file.
    RPerl *.pm input file with CPP  ops will create MyCompiledModule.pmc, MyCompiledModule.cpp, & MyCompiledModule.h output files.
    RPerl *.pl input file with CPP  ops will create my_compiled_program (or my_compiled_program.exe on Windows) & my_compiled_program.cpp output files.
    Argument may be entirely omitted, foo.* input file will default to foo.* output file(s).

The output file option tells RPerl which file(s) to generate by specifying a file prefix. RPerl automatically determines the output file suffixes, depending upon the operations (AKA "ops") mode of either PERL or CPP. The PERL ops mode is a test mode, so only Perl output files are generated; the CPP ops mode is a real compile mode, so C++ output files are generated.

If the output file option is omitted, then the original prefix of each input file is used as the prefix of the output file, as one would expect. For example, without an output file option provided, an input file named 'foo.pl' in CPP ops mode will generate the 'foo.cpp' and 'foo' (or 'foo.exe' in Windows) output files. With an output file option of -o=bar provided, the same 'foo.pl' input file will instead generate the 'bar.cpp' and 'bar' (or 'bar.exe') output files.
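
The naming rules listed in the help text above can be summarized as a small lookup. The following is a minimal sketch in plain Perl 5 which only restates those rules; the expected_output_files() subroutine and its 4 arguments are hypothetical, and this is not RPerl's own implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: list the output files described in the B.4 help text
sub expected_output_files {
    my ( $input_file, $ops_mode, $prefix, $is_windows ) = @_;

    my ( $input_prefix, $suffix ) = ( $input_file =~ m/\A(.+)[.](pl|pm)\z/xms );
    if ( not defined $suffix ) {
        die 'Input file must end in .pl or .pm: ' . $input_file . "\n";
    }

    # output file option omitted: reuse the prefix of the input file
    if ( not defined $prefix ) { $prefix = $input_prefix; }

    my @files;
    if ( $suffix eq 'pm' ) {
        # modules: *.pmc in both ops modes, plus *.cpp & *.h in CPP ops mode
        @files = ( $prefix . '.pmc' );
        if ( $ops_mode eq 'CPP' ) { push @files, $prefix . '.cpp', $prefix . '.h'; }
    }
    else {
        # programs: executable in both ops modes, plus *.cpp in CPP ops mode
        @files = ( $is_windows ? $prefix . '.exe' : $prefix );
        if ( $ops_mode eq 'CPP' ) { push @files, $prefix . '.cpp'; }
    }
    return \@files;
}

print join( q{ }, @{ expected_output_files( 'foo.pl', 'CPP' ) } ),        "\n";
print join( q{ }, @{ expected_output_files( 'foo.pl', 'CPP', 'bar' ) } ), "\n";
print join( q{ }, @{ expected_output_files( 'Foo.pm', 'CPP' ) } ),        "\n";
```

The first 2 calls reproduce the 'foo.pl' example from the paragraph above, without and then with an output file option of -o=bar.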

B.5: C++ Compiler

--CXX=/path/to/compiler
    Specify path to C++ compiler for use in subcompile modes, equivalent to '--mode CXX=/path/to/compiler' or 'CXX' manual Makefile argument, 'g++' by default.

The C++ compiler option tells RPerl which subcompiler to use for converting C++ source code into binary machine code.

This option is a shorthand provided for brevity; please see "B.17: Modes, C++ Compiler".

If both the shorthand and longhand forms of the C++ compiler option are omitted, then RPerl will utilize the GNU C++ compiler or give an error if it is not installed.

B.6: Modes, Magic

--mode magic=LOW ...OR... -m magic=LOW
--mode magic=MEDIUM ...OR... -m magic=MEDIUM
--mode magic=HIGH ...OR... -m magic=HIGH
    Specify magic mode, LOW by default.
    If set to LOW, accept low-magic (static) Perl source code in the source code input file(s).
    If set to MEDIUM, accept medium-magic (mostly static) Perl source code in the source code input file(s).
    If set to HIGH, accept high-magic (dynamic) Perl source code in the source code input file(s).
    Because only low-magic mode is supported at this time, this option does not currently have any effect.

The magic mode option tells RPerl which grammar to use when parsing the input source code.

Low-magic parsing uses the Grammar.eyp EYAPP grammar file, medium-magic parsing will use GrammarMedium.eyp, and high-magic parsing will use GrammarHigh.eyp.

B.7: Modes, Code

--mode code=PERL ...OR... -m code=PERL
--mode code=CPP ...OR... -m code=CPP
    Specify source code mode, CPP by default.
    If set to PERL, generate Perl source code in the source code output file(s).
    If set to CPP, generate C++ source code in the source code output file(s).
    PERL operations mode forces PERL code mode; CPP operations mode forces CPP code mode.
    Because code mode is dependent upon operations mode, this option does not currently have any effect.

The source code mode option tells RPerl which grammar to use when generating the output source code.

Perl code generation produces output source code which can be parsed using RPerl itself.

C++ code generation produces output source code which can be parsed using any standards-compliant C++ compiler.

B.8: Modes, Operations

--mode ops=PERL ...OR... -m ops=PERL
--mode ops=CPP ...OR... -m ops=CPP
    Specify operations mode, CPP by default.
    If set to PERL, generate Perl operations in the source code output file(s).
    If set to CPP, generate C++ operations in the source code output file(s).
    PERL ops mode forces PERL types mode & PARSE or GENERATE compile mode; PERLOPS_PERLTYPES is test mode, does not actually compile.

The two most important RPerl modes are "operations" (AKA "ops") and "data types" (AKA "types").

The ops mode option tells RPerl if it should use Perl operations or C++ operations in the final output. Perl operations are slow, but provide support for high-magic functionality. C++ operations are fast, but only support low-magic functionality.

Normal Perl is considered an interpreted language and uses Perl operations. When an RPerl application is run in Perl operations mode, then the application is interpreted instead of compiled. When an RPerl application runs in C++ operations mode, then it is compiled.

B.9: Modes, Types

--mode types=PERL ...OR... -m types=PERL
--mode types=CPP ...OR... -m types=CPP
--mode types=DUAL ...OR... -m types=DUAL
    Specify data types mode, CPP by default.
    If set to PERL, generate Perl data types in the source code output file(s).
    If set to CPP, generate C++ data types in the source code output file(s).
    If set to DUAL, generate both Perl and C++ data types in the source code output file(s).
    DUAL mode allows generate-once-compile-many types, selected by '#define __FOO__TYPES' in lib/rperltypes_mode.h or `gcc -D__FOO__TYPES` manual subcompile argument.

The types mode option tells RPerl if it should use Perl data types or C++ data types in the final output. Just like RPerl's ops mode, Perl data types are slow and high-magic, while C++ types are fast and low-magic.

Normal interpreted Perl uses Perl types (and Perl ops). When an RPerl application is run in Perl types mode, then the application is compiled but only partial speed optimization is achieved. When an RPerl application runs in C++ types mode, then it is compiled with full speed optimization.

Because RPerl's first goal is performance, support for C++-ops-Perl-types mode and C++-ops-dual-types mode is limited to hand-compiled RPerl applications only, with full support planned for a future version release. Meanwhile, the high-performance C++-ops-C++-types mode is currently supported for compiled RPerl applications, and the slow Perl-ops-Perl-types mode (AKA "test mode") is supported for testing purposes.

B.10: Modes, Integer Type

--mode type_integer=LONG ...OR... -m type_integer=LONG
--mode type_integer=LONG__LONG ...OR... -m type_integer=LONG__LONG
    Specify native C++ integer data type, same as internal Perl type by default.
    If set to LONG, utilize 'long' as native type for 'integer', at least 32 bits;
    If set to LONG__LONG, utilize 'long long' as native type for 'integer', at least 64 bits.

The type_integer mode option tells RPerl which native C++ data type to use as the RPerl 'integer' data type when compiled in CPPOPS_CPPTYPES mode.

According to the C language standard, 'long' will provide at least 32 bits of storage, and 'long long' will provide at least 64 bits.

By default, RPerl will set this mode option to match the internal C data type used by the Perl interpreter itself, in order to produce the same output in both PERLTYPES and CPPTYPES modes.
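
To see which widths your own C++ compiler provides, a small check such as the following may help. It is a stand-alone sketch and not part of RPerl; the helper name storage_bits() is hypothetical.

```cpp
#include <climits>
#include <cstddef>

// Number of bits of storage occupied by type T on the current platform.
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}

// These minimums are guaranteed by the language standard, so both checks
// should pass on any conforming compiler.
static_assert(storage_bits<long>() >= 32, "'long' must provide at least 32 bits");
static_assert(storage_bits<long long>() >= 64, "'long long' must provide at least 64 bits");
```

The actual widths may exceed the minimums: on 64-bit Linux 'long' is typically 64 bits, while on 64-bit Windows it is typically 32 bits, which is one reason the LONG__LONG setting can matter.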

B.11: Modes, Number Type

--mode type_number=DOUBLE ...OR... -m type_number=DOUBLE
--mode type_number=LONG__DOUBLE ...OR... -m type_number=LONG__DOUBLE
    Specify native C++ number data type, same as internal Perl type by default.
    If set to DOUBLE, utilize 'double' as native type for 'number', usually at least 32 bits, but not guaranteed;
    If set to LONG__DOUBLE, utilize 'long double' as native type for 'number', usually at least 64 bits, but not guaranteed.

The type_number mode option tells RPerl which native C++ data type to use as the RPerl 'number' data type when compiled in CPPOPS_CPPTYPES mode.

According to the C language standard, there is no guaranteed bit length for these data types, although 'long double' will not provide less storage than 'double'.

On some systems, 'double' is 32 bits and 'long double' is 64 bits, while on other systems 'double' is 64 bits and 'long double' is 128 bits. Every system is different, and other bit lengths may also be encountered.

By default, RPerl will set this mode option to match the internal C data type used by the Perl interpreter itself, in order to produce the same output in both PERLTYPES and CPPTYPES modes.
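
Because the bit lengths vary, it can be useful to ask the C++ compiler directly. The following sketch reports both the storage size and the precision of a floating-point type; the names FloatInfo and float_info() are hypothetical and are not provided by RPerl.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Storage size and precision of one floating-point type.
struct FloatInfo {
    std::size_t storage_bits;  // bits of storage, including any padding
    int significand_digits;    // digits of precision, in the platform's radix (normally 2)
};

// Collect the figures for type T, such as double or long double.
template <typename T>
FloatInfo float_info() {
    return FloatInfo{ sizeof(T) * CHAR_BIT, std::numeric_limits<T>::digits };
}
```

For example, on a common x86-64 Linux system with `g++`, float_info<double>() reports 64 storage bits with 53 significand digits, and float_info<long double>() reports 128 storage bits with 64 significand digits; your system may differ.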

B.12: Modes, Type Checking

--mode check=OFF ...OR... -m check=OFF
--mode check=ON ...OR... -m check=ON
--mode check=TRACE ...OR... -m check=TRACE
    Specify data type checking mode, TRACE by default.
    If set to OFF, do not perform dynamic type checking, only built-in C++ static type checking.
    If set to ON, perform dynamic type checking in addition to built-in C++ static type checking.
    If set to TRACE, perform dynamic type checking in addition to built-in C++ static type checking, with subroutine-and-variable trace information.

The data type checking mode option tells RPerl if it should perform additional tests to ensure correct data types are used as "subroutine arguments" (not to be confused with "command-line arguments"). RPerl subroutines can accept 1 or more arguments (AKA "parameters"), each of which must be of a specific data type.

In C++-ops-C++-types mode, all data types must be determined at compile time and thus can not change (AKA "static"), so the C++ compiler performs all necessary type checking. In any Perl types mode, the data types can not be determined until runtime and thus can change (AKA "dynamic"), so RPerl must itself perform type checking for subroutine arguments.

If the type checking mode is set to TRACE, then the name of the problematic subroutine and argument will be "traced" and provided in the RPerl error message. This setting can be useful for RPerl application developers and debugging purposes.
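
The difference between the OFF, ON, and TRACE settings can be modeled with a short C++ sketch. This is a simplified illustration only: the names Kind, CheckMode, and check_integer_argument() are hypothetical, and the message text is invented for the example rather than copied from RPerl's real error messages.

```cpp
#include <string>

enum class Kind { INTEGER, NUMBER, STRING };  // runtime tag of a dynamic value
enum class CheckMode { OFF, ON, TRACE };      // models the check mode option

// Returns an empty string when the argument is accepted, else an error message.
std::string check_integer_argument(Kind actual, CheckMode mode,
                                   const std::string& subroutine,
                                   const std::string& argument) {
    // OFF performs no dynamic checking at all; a correct type always passes.
    if (mode == CheckMode::OFF || actual == Kind::INTEGER) { return ""; }

    std::string message = "integer value expected but non-integer value found";
    // TRACE adds the subroutine and argument names to the message.
    if (mode == CheckMode::TRACE) {
        message += ", in argument '" + argument + "' of subroutine '" + subroutine + "'";
    }
    return message;
}
```

With CheckMode::ON a mismatch is reported without location details; with CheckMode::TRACE the same mismatch also names the subroutine and argument.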

B.13: Modes, Dependencies

--mode dependencies=OFF ...OR... -m dependencies=OFF
--mode dependencies=ON ...OR... -m dependencies=ON
    Specify dependencies mode, ON by default.
    If set to OFF, do not search for or compile dependencies.
    If set to ON, recursively search for dependencies and subdependencies, include as additional input file(s).
    WARNING: Disabling dependencies will likely cause errors or undefined behavior.

The dependencies mode option tells RPerl if it should compile all the Perl application code which is used by the input RPerl application code via the use operator. If 1 RPerl application input file is provided which has 10 dependencies, and each of those 10 dependencies has 10 sub-dependencies of its own, then the RPerl compiler actually has at least 111 input files to compile, and possibly many more than 111 if any of the sub-dependencies has sub-sub-dependencies, etc.
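
The arithmetic behind the figure of 111 can be written as a short C++ function. This is a sketch under two assumptions which are not stated by RPerl itself: every file has the same number of dependencies, and no dependency is shared between files. The name count_input_files() is hypothetical.

```cpp
// Count the input files when every file has 'fanout' dependencies, followed
// down to 'depth' levels; depth 0 means only the application file itself.
long count_input_files(long fanout, int depth) {
    long total = 0;
    long files_at_level = 1;  // level 0: the single application input file
    for (int level = 0; level <= depth; ++level) {
        total += files_at_level;
        files_at_level *= fanout;  // each file brings in 'fanout' more files
    }
    return total;
}
```

Here count_input_files(10, 2) returns 1 + 10 + 100 = 111, and adding 1 more level of sub-sub-dependencies via count_input_files(10, 3) returns 1111.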

If your RPerl application depends upon other RPerl code, then all should go as planned. If your RPerl application depends upon other non-RPerl code, then you will likely encounter problems as RPerl attempts to compile the non-RPerl components.

If the dependencies mode option is set to OFF, then you will likely encounter problems as the compiled RPerl application attempts to locate the compiled form of each dependency, which will not be found unless manually compiled or copied from a previous compile. Thus, disabling compilation of dependencies is strongly discouraged, because RPerl will not function properly if any dependency (or sub-dependency, etc) is not compiled.

B.14: Modes, Uncompile

--mode uncompile=OFF ...OR... -m uncompile=OFF
--mode uncompile=SOURCE ...OR... -m uncompile=SOURCE
--mode uncompile=BINARY ...OR... -m uncompile=BINARY
--mode uncompile=INLINE ...OR... -m uncompile=INLINE
--mode uncompile=SOURCE_BINARY ...OR... -m uncompile=SOURCE_BINARY
--mode uncompile=SOURCE_BINARY_INLINE ...OR... -m uncompile=SOURCE_BINARY_INLINE
    Specify uncompile mode, OFF by default.
    If set to SOURCE, delete all generated C++ output source code (not subcompiled) files: *.cpp, *.h, *.pmc
    If set to BINARY, delete all generated C++ output binary (subcompiled) files: *.o, *.a, *.so, *.exe, non-suffixed executables
    If set to INLINE, delete all generated C++ output Inline::CPP files: _Inline/ directory
    If set to SOURCE_BINARY, delete both SOURCE and BINARY files.
    If set to SOURCE_BINARY_INLINE, delete SOURCE, BINARY, and INLINE files.
    For *.pm Perl module input files, BINARY and INLINE are equivalent.

The uncompile mode option tells RPerl if it should do the opposite of compiling, which is to delete any compiled files created by previous executions of RPerl.

Many different files may be created each time RPerl is executed, which are separated into 3 different groups: human-readable RPerl source code files (AKA "SOURCE"), machine-readable RPerl binary files (AKA "BINARY"), and a mixture of both source and binary files created by the Inline software in the "_Inline" directory (AKA "INLINE"). It is recommended to delete all 3 groups using the "SOURCE_BINARY_INLINE" setting, to avoid possible problems with mismatching or partially-compiled components.

RPerl will attempt to overwrite any files generated by previous RPerl executions, so it is generally unnecessary to explicitly use the uncompile option when executing RPerl multiple times with the same input file(s) each time.

B.15: Modes, Compile

--mode compile=OFF ...OR... -m compile=OFF
--mode compile=PARSE ...OR... -m compile=PARSE
--mode compile=GENERATE ...OR... -m compile=GENERATE
--mode compile=SAVE ...OR... -m compile=SAVE
--mode compile=SUBCOMPILE ...OR... -m compile=SUBCOMPILE
    Specify compile mode, SUBCOMPILE by default.
    If set to PARSE, begin with RPerl input source code file(s), and end with RPerl abstract syntax tree output data structure.
    If set to GENERATE, begin with RPerl input source code file(s), and end with RPerl and/or C++ output source code in memory.
    If set to SAVE, begin with RPerl input source code file(s), and end with RPerl and/or C++ output source code file(s) saved to disk.
    If set to SUBCOMPILE, begin with RPerl input source code file(s), and end with C++ output binary file(s).

The compile mode option tells RPerl how far to progress through the compile phase categories of "PARSE", "GENERATE", "SAVE", and "SUBCOMPILE". As detailed in Section 2.3, there are multiple distinct phases for both parsing and saving, while generating and subcompiling are only one phase each. The compile mode option groups all 3 parsing phases together as PARSE, and both saving phases as SAVE.

The OFF, PARSE, and GENERATE phase categories may be useful for RPerl system developers while debugging the RPerl compiler itself. No output files are saved to disk for these 3 phase categories, so they are not particularly useful for RPerl application developers.

Both the SAVE and SUBCOMPILE phase categories save output files to disk, so they may be useful to RPerl application developers. The SAVE phase category does not call the C++ compiler to perform subcompiling, which can be done manually by the developer if they so choose.
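
Because the phase categories are ordered, the compile mode option behaves like a "stop after" marker. The following C++ sketch models this ordering; the names CompileMode, reaches(), and saves_files_to_disk() are hypothetical and do not appear in RPerl.

```cpp
// Phase categories in the order RPerl progresses through them.
enum class CompileMode { OFF = 0, PARSE = 1, GENERATE = 2, SAVE = 3, SUBCOMPILE = 4 };

// True if the selected compile mode progresses at least as far as 'phase'.
bool reaches(CompileMode selected, CompileMode phase) {
    return phase != CompileMode::OFF
        && static_cast<int>(selected) >= static_cast<int>(phase);
}

// Only SAVE and SUBCOMPILE write output files to disk.
bool saves_files_to_disk(CompileMode selected) {
    return reaches(selected, CompileMode::SAVE);
}
```

For example, saves_files_to_disk(CompileMode::GENERATE) is false, while reaches(CompileMode::SUBCOMPILE, CompileMode::PARSE) is true because subcompiling begins with parsing.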

B.16: Modes, Subcompile

--mode subcompile=OFF ...OR... -m subcompile=OFF
--mode subcompile=ASSEMBLE ...OR... -m subcompile=ASSEMBLE
--mode subcompile=ARCHIVE ...OR... -m subcompile=ARCHIVE
--mode subcompile=SHARED ...OR... -m subcompile=SHARED
--mode subcompile=STATIC ...OR... -m subcompile=STATIC
--mode subcompile=DYNAMIC ...OR... -m subcompile=DYNAMIC
    Specify subcompile mode, DYNAMIC by default.
    If set to ASSEMBLE, generate *.o object binary output file(s).
    If set to ARCHIVE, generate *.a object archive binary output file(s).
    If set to SHARED, generate *.so shared object binary output file(s).
    If set to STATIC, generate statically-linked *.exe or non-suffixed executable binary output file(s).
    If set to DYNAMIC, generate dynamically-linked *.exe or non-suffixed executable binary output file(s).

The subcompile mode option tells RPerl what output is desired from the C++ compiler. This option is recommended for use by RPerl developers with prior C++ compiler experience.

The "ASSEMBLE", "ARCHIVE", and "SHARED" subcompile modes will generate non-executable binary files from both *.pl RPerl program input files and *.pm RPerl module input files. The generated binary files are reusable components; experienced RPerl and C++ developers may utilize the C++ compiler to "link" (combine) together multiple source and binary files in order to create a final binary output file.

The "STATIC" subcompile mode is invalid for *.pm RPerl module input files; for *.pl RPerl programs, STATIC mode creates a binary executable which contains all necessary components included within 1 file.

For *.pm input files, the "DYNAMIC" subcompile mode will use Inline and PMC file(s) to automatically interface between the compiled RPerl output and any non-compiled Perl code which calls the compiled output. For *.pl input files, DYNAMIC mode creates a binary executable which does not contain all necessary components, and thus must search for external library files such as "libperl.a" or "libperl.so" during every execution.

The C++ compiler and linker can be a bit complicated and confusing; one possible place to start learning more is the GCC compiler documentation:

https://gcc.gnu.org/onlinedocs/gcc/Link-Options.html

B.17: Modes, C++ Compiler

--mode CXX=/path/to/compiler ...OR... -m CXX=/path/to/compiler
    Specify path to C++ compiler for use in subcompile modes, equivalent to '--CXX=/path/to/compiler' or 'CXX' manual Makefile argument, 'g++' by default.

The C++ compiler mode option tells RPerl which C++ compiler to utilize throughout the various phases of compilation. RPerl needs a C++ compiler to function in general, and a good C++ compiler is especially important during the subcompile phases.

Use of the GNU C++ compiler `g++` is strongly recommended, although in theory you should be able to use any C++ compiler which supports the C++11 standard.

https://en.wikipedia.org/wiki/C++11

B.18: Modes, Parallelize

--mode parallel=OFF ...OR... -m parallel=OFF
--mode parallel=OPENMP ...OR... -m parallel=OPENMP
--mode parallel=MPI ...OR... -m parallel=MPI ((( COMING SOON!!! )))
--mode parallel=OPENCL ...OR... -m parallel=OPENCL ((( COMING SOON!!! )))
    Specify automatic parallelization mode, OFF by default.
    If set to OFF, do not automatically parallelize any input files.
    If set to OPENMP, automatically parallelize all eligible input files for use on shared-memory OpenMP systems.
    If set to MPI (COMING SOON), automatically parallelize all eligible input files for use on distributed-memory MPI systems.
    If set to OPENCL (COMING SOON), automatically parallelize all eligible input files for use on heterogeneous or GPU systems.

The parallel mode option tells RPerl if it should automatically parallelize the RPerl input file(s) for faster runtime execution performance on parallel computing hardware platforms.

The "OPENMP" parallel mode utilizes the OpenMP software to provide support for shared memory computing hardware platforms, such as multi-core CPUs and supercomputers.
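
For readers unfamiliar with OpenMP, the following hand-written C++ loop shows the general style of shared-memory parallelization. It is not output generated by RPerl, and the function name parallel_sum() is hypothetical; it only illustrates how OpenMP divides loop iterations among CPU cores.

```cpp
#include <cstddef>
#include <vector>

// Sum a vector using an OpenMP work-sharing loop; each thread accumulates a
// private partial total, and the partial totals are combined at the end.
double parallel_sum(const std::vector<double>& values) {
    double total = 0.0;
    const long count = static_cast<long>(values.size());
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < count; ++i) {
        total += values[static_cast<std::size_t>(i)];
    }
    return total;
}
```

When subcompiled manually with `g++ -fopenmp`, the loop runs on multiple cores; without that argument, the pragma is ignored and the same code runs serially with the same result.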

COMING SOON: The "MPI" parallel mode utilizes the MPI software to provide support for distributed memory computing hardware platforms, such as clusters and the cloud.

COMING SOON: The "OPENCL" parallel mode utilizes the OpenCL software to provide support for heterogeneous computing hardware platforms, such as CPU/GPU combinations, GPUs, and FPGAs.

B.19: Modes, Parallelize, Number Of Cores

--mode num_cores=2 ...OR... -m num_cores=2 ...OR... --num_cores=2 ...OR... -num=2
--mode num_cores=4 ...OR... -m num_cores=4 ...OR... --num_cores=4 ...OR... -num=4
--mode num_cores=8 ...OR... -m num_cores=8 ...OR... --num_cores=8 ...OR... -num=8
...ETC...
    Specify number of CPU cores to utilize for OPENMP automatic parallelization mode, 4 by default.

The num_cores mode option tells RPerl how many CPU cores to use during runtime execution of an auto-parallelized RPerl input file.

This option is only applicable when the parallel mode option is set to "OPENMP", for use on shared memory computing hardware platforms.

It may be best to choose a value which is a power of 2, although this is not required.

B.20: Modes, Execute

--mode execute=OFF ...OR... -m execute=OFF
--mode execute=ON ...OR... -m execute=ON
    Specify execute mode, ON by default.
    If set to OFF, do not load or run any user-supplied program(s).
    If set to ON with one *.pl Perl program input file, load and run the program.
    If set to ON with more than one *.pl Perl program input file, do not load or run any programs.

The execute mode option tells RPerl if it should actually run a *.pl RPerl program input file, or just compile it without running.

The default behavior for *.pl program files is to compile-then-execute, in order to provide a similar feel to normal Perl's interpret-while-executing behavior.

When *.pm RPerl module input file(s) are provided, then no execution occurs because module files are not executable.

B.21: Modes, Source Code Labels

--mode label=OFF ...OR... -m label=OFF
--mode label=ON ...OR... -m label=ON
    Specify source section label mode, ON by default.
    If set to OFF, generate minimal output source code, may save disk space.
    If set to ON, generate some informative labels in output source code, may be more human-readable.

The source code label mode option tells RPerl if it should create human-readable labels embedded as comments in the C++ output source code. This option may be useful to RPerl system developers when comparing an RPerl application's original uncompiled source code input with the compiled C++ source code output.

B.22: Flags, Verbose

--Verbose ...OR... -V
--noVerbose ...OR... -noV
    Include additional user information in output, or not.
    If enabled, equivalent to `export RPERL_VERBOSE=1` shell command.
    Disabled by default.
    Uppercase 'V' not to be confused with lowercase 'v' in 'version' argument.

The verbose flag option tells RPerl to display additional user-friendly information during various phases of compilation.

Verbose output is generated using the RPerl::verbose() RPerl system subroutine.

If the verbose and debug flags are not enabled, then a problem-free RPerl compilation will produce no output whatsoever, in order to preserve the "*nix" tradition of displaying no unnecessary output during successful execution of system software.

https://en.wikipedia.org/wiki/*nix

B.23: Flags, Debug

--Debug ...OR... -D
--noDebug ...OR... -noD
    Include debugging & system diagnostic information in output, or not.
    If enabled, equivalent to `export RPERL_DEBUG=1` shell command.
    Disabled by default.
    Uppercase 'D' not to be confused with lowercase 'd' in 'dependencies' argument.

The debug flag option tells RPerl to display additional developer-friendly information during various phases of compilation.

Debugging output is generated using the identical RPerl::debug() & RPerl::diag() RPerl system subroutines.

Use of both the verbose and debug flags `rperl -V -D` is recommended for RPerl system developers.

B.24: Flags, Warnings

--Warnings ...OR... -W
--noWarnings ...OR... -noW
    Include system warnings in output, or not.
    If disabled, equivalent to `export RPERL_WARNINGS=0` shell command.
    Enabled by default.

The warnings flag option tells RPerl if it should display non-fatal warnings.

The default behavior is that RPerl should display warnings, in order to encourage users and developers alike to address problems before they become fatal errors, or cause other unexpected or undefined behavior.

B.25: Flags, Test

--test ...OR... -t
    Test mode: Perl ops, Perl types, Parse & Generate (no Save or Compile)
    If enabled, equivalent to '--mode ops=PERL --mode types=PERL --mode compile=GENERATE' arguments.
    Disabled by default.

The test flag option tells RPerl to operate in test mode, which means RPerl will utilize Perl operations and Perl data types, and the resulting output code will not actually be saved to disk or subcompiled. Because the Perl-ops-Perl-types mode is the fully-implemented reference design for the RPerl language, the test mode may be useful to RPerl application developers in order to check the correctness of an RPerl application which includes some RPerl feature that is not yet implemented in the optimized C++ modes.

B.26: Flags, Low Magic

--low ...OR... -l
    Accept low-magic (static) Perl source code in the source code input file(s).
    Enabled by default, equivalent to '--mode magic=LOW' argument.
    Because only low-magic mode is supported at this time, this option does not currently have any effect.

The low-magic flag option tells RPerl to operate in low-magic mode, which supports a small subset of Perl and runs at the fastest speed.

This option is a shorthand provided for brevity, please see: "B.6: Modes, Magic"

B.27: Flags, Medium Magic

--medium
    Accept medium-magic (mostly static) Perl source code in the source code input file(s).
    Disabled by default, equivalent to '--mode magic=MEDIUM' argument.
    Because only low-magic mode is supported at this time, this option does not currently have any effect.
    Shorthand '-m' used for '--mode' argument, not '--medium' argument.

The medium-magic flag option tells RPerl to operate in medium-magic mode, which supports a large subset of Perl and runs at a very fast speed.

This option is a shorthand provided for brevity, please see: "B.6: Modes, Magic"

B.28: Flags, High Magic

--high
    Accept high-magic (dynamic) Perl source code in the source code input file(s).
    Disabled by default, equivalent to '--mode magic=HIGH' argument.
    Because only low-magic mode is supported at this time, this option does not currently have any effect.
    Shorthand '-h' used for '--help' argument, not '--high' argument.

The high-magic flag option tells RPerl to operate in high-magic mode, which supports all of Perl and runs at a fast speed.

This option is a shorthand provided for brevity, please see: "B.6: Modes, Magic"

B.29: Flags, Dependencies

--dependencies ...OR... -d
--nodependencies ...OR... -nod
    Follow and compile dependencies, or not.
    Enabled by default, equivalent to '--mode dependencies=ON' argument.
    Lowercase 'd' not to be confused with uppercase 'D' in 'Debug' argument.

The dependencies flag option tells RPerl if it should recursively follow and compile all dependencies of all RPerl source code input files.

This option is a shorthand provided for brevity, please see: "B.13: Modes, Dependencies"

B.30: Flags, Uncompile

--uncompile ...OR... -u
--nouncompile ...OR... -nou
--uuncompile ...OR... -uu
--nouuncompile ...OR... -nouu
--uuuncompile ...OR... -uuu
--nouuuncompile ...OR... -nouuu
    Uncompile (delete C++ source code and/or binary output files), or not.
    Repeat as 'uu' and 'uuu' for more thorough file removal.
    Do not confuse uncompile with decompile (recreate RPerl source code from C++ source code or binary output files), which does not currently exist.
    '-u' equivalent to '--mode uncompile=SOURCE --mode compile=OFF --mode execute=OFF' arguments.
    '-uu' equivalent to '--mode uncompile=SOURCE_BINARY --mode compile=OFF --mode execute=OFF' arguments.
    '-uuu' equivalent to '--mode uncompile=SOURCE_BINARY_INLINE --mode compile=OFF --mode execute=OFF' arguments.
    Disabled by default.

The uncompile flag option tells RPerl to uncompile the input file(s) instead of compiling them. The "u" of "uncompile" may be repeated as "uu" or "uuu" for greater uncompile effect (more files deleted).

This option is a shorthand provided for brevity, please see: "B.14: Modes, Uncompile"

B.31: Flags, Compile

--compile ...OR... -c
--nocompile ...OR... -noc
    Generate & subcompile C++ source code, or not.
    Enabled by default, equivalent to '--mode compile=SUBCOMPILE' argument.

The compile flag option tells RPerl if it should compile the source code input file(s), which is obviously enabled by default because RPerl is a compiler and thus is designed to compile as its primary purpose.

If disabled via --nocompile or -noc, RPerl will still perform the PARSE and GENERATE compile phases, but will not perform the SAVE or SUBCOMPILE phases, so no output files are saved to disk or subcompiled to binary using the C++ compiler. This compile phase effect is similar to test mode in that files are not saved or subcompiled, but without automatically setting the operations or data types modes to Perl.

This option is a shorthand provided for brevity, please see: "B.15: Modes, Compile"

B.32: Flags, Subcompile, Assemble

--assemble
    Assemble subcompile mode, output *.o object file(s).
    If enabled, equivalent to '--mode subcompile=ASSEMBLE' argument or `gcc -c` manual subcompile argument.
    Disabled by default.

The assemble flag tells RPerl to run the C++ assembler and save non-executable binary *.o object file(s) to disk.

This option is a shorthand provided for brevity, please see: "B.16: Modes, Subcompile"

B.33: Flags, Subcompile, Archive

--archive
    Archive subcompile mode, output *.a object archive file(s).
    If enabled, equivalent to '--mode subcompile=ARCHIVE' argument or `gcc -c` followed by `ar` manual subcompile command.
    Disabled by default.

The archive flag tells RPerl to run the C++ compiler and save non-executable binary *.a archive library file(s) to disk.

This option is a shorthand provided for brevity, please see: "B.16: Modes, Subcompile"

B.34: Flags, Subcompile, Shared Object

--shared
    Shared subcompile mode, output *.so shared object file(s).
    If enabled, equivalent to '--mode subcompile=SHARED' argument or `gcc -shared` manual subcompile command.
    Disabled by default.

The shared flag tells RPerl to run the C++ compiler and save non-executable binary *.so shared object library file(s) to disk.

This option is a shorthand provided for brevity, please see: "B.16: Modes, Subcompile"

B.35: Flags, Subcompile, Static

--static
--nostatic
    Static subcompile mode, output *.exe or non-suffixed statically-linked executable file(s).
    If disabled, equivalent to '--mode subcompile=DYNAMIC' argument or `gcc` manual subcompile command.
    If enabled, equivalent to '--mode subcompile=STATIC' argument or `gcc -static` manual subcompile command.
    Disabled by default.

The static flag tells RPerl to run the C++ compiler and save executable binary *.exe or non-suffixed statically-linked program file(s) to disk.

There is no equivalent --dynamic flag, because dynamic subcompile mode is already the default behavior.

This option is a shorthand provided for brevity, please see: "B.16: Modes, Subcompile"

B.36: Flags, Parallelize

--parallel ...OR... -p
--noparallel ...OR... -nop
    Automatically parallelize input code, or not.
    Disabled by default.
    Equivalent to '--mode parallel=OPENMP' argument.

The parallel flag tells RPerl if it should auto-parallelize the RPerl input file(s).

This option is a shorthand provided for brevity, please see: "B.18: Modes, Parallelize"

B.37: Flags, Execute

--execute ...OR... -e
--noexecute ...OR... -noe
    Run input code after optional compile, or not.
    Enabled by default for *.pl program input files, always disabled for *.pm module input files or multiple input files.
    Equivalent to '--mode execute=ON' argument.

The execute flag tells RPerl if it should execute a *.pl RPerl program input file.

This option is a shorthand provided for brevity, please see: "B.20: Modes, Execute"

APPENDIX C: RPERL CRITICS

    # DEV NOTE: disable RequireTidyCode because Perl::Tidy is not perfect and may complain even if the code is tidy;
    
    # disable PodSpelling because calling the external spellchecker can cause errors such as aspell's "No word lists can be found for the language FOO";
    
    # disable RequireExplicitPackage because 'use RPerl;' comes before package name(s), and Grammar.eyp will catch any other violations
    
    '-exclude'  => ['RequireTidyCode', 'PodSpelling', 'RequireExplicitPackage'],
    
    
    # [[[ CRITICS ]]]
    
    ## no critic qw(ProhibitUselessNoCritic ProhibitMagicNumbers RequireCheckedSyscalls)  # USER DEFAULT 1: allow numeric values & print operator
    
    ## no critic qw(RequireInterpolationOfMetachars)  # USER DEFAULT 2: allow single-quoted control characters & sigils
    
    ## no critic qw(ProhibitConstantPragma ProhibitMagicNumbers)  # USER DEFAULT 3: allow constants
    
    ## no critic qw(ProhibitExplicitStdin)  # USER DEFAULT 4: allow <STDIN> prompt
    
    ## no critic qw(RequireBriefOpen)  # USER DEFAULT 5: allow open() in perltidy-expanded code
    
    ## no critic qw(ProhibitCStyleForLoops)  # USER DEFAULT 6: allow C-style for() loop headers
    
    ## no critic qw(RequireTrailingCommas)  # USER DEFAULT X: no trailing commas in RPerl lists  # NEED ANSWER: RPerl is mostly array refs, do we even need this?
    
    
    ## no critic qw(ProhibitUselessNoCritic PodSpelling)  # DEVELOPER DEFAULT 1a: allow unreachable & POD-commented code, must be on line 1
    
    ## no critic qw(ProhibitUnreachableCode RequirePodSections RequirePodAtEnd)  # DEVELOPER DEFAULT 1b: allow POD & unreachable or POD-commented code, must be after line 1
    
    ## no critic qw(ProhibitStringySplit ProhibitInterpolationOfLiterals)  # DEVELOPER DEFAULT 2: allow string test values
    
    
    ## no critic qw(ProhibitStringyEval)  # SYSTEM DEFAULT 1: allow eval()
    
    ## no critic qw(ProhibitCascadingIfElse)  # SYSTEM DEFAULT 2: allow argument-handling logic
    
    ## no critic qw(Capitalization ProhibitMultiplePackages ProhibitReusedNames)  # SYSTEM DEFAULT 3: allow multiple & lower case package names
    
    ## no critic qw(RequireCheckingReturnValueOfEval)  # SYSTEM DEFAULT 4: allow eval() test code blocks
    
    
    ## no critic qw(ProhibitBooleanGrep)  # SYSTEM SPECIAL 1: allow grep
    
    ## no critic qw(ProhibitAutoloading RequireArgUnpacking)  # SYSTEM SPECIAL 2: allow Autoload & read-only @ARG
    
    ## no critic qw(ProhibitParensWithBuiltins ProhibitNoisyQuotes)  # SYSTEM SPECIAL 3: allow auto-generated code
    
    ## no critic qw(ProhibitExcessMainComplexity)  # SYSTEM SPECIAL 4: allow complex code outside subroutines, must be on line 1
    
    ## no critic qw(ProhibitExcessComplexity)  # SYSTEM SPECIAL 5: allow complex code inside subroutines, must be after line 1
    
    ## no critic qw(ProhibitPostfixControls)  # SYSTEM SPECIAL 6: PERL CRITIC FILED ISSUE #639, not postfix foreach or if
    
    ## no critic qw(ProhibitDeepNests)  # SYSTEM SPECIAL 7: allow deeply-nested code
    
    ## no critic qw(ProhibitNoStrict)  # SYSTEM SPECIAL 8: allow no strict
    
    ## no critic qw(RequireUseStrict)  # SYSTEM SPECIAL 9: allow omitted strict
    
    ## no critic qw(RequireBriefOpen)  # SYSTEM SPECIAL 10: allow complex processing with open filehandle
    
    ## no critic qw(ProhibitBacktickOperators)  # SYSTEM SPECIAL 11: allow system command execution
    
    ## no critic qw(ProhibitCascadingIfElse)  # SYSTEM SPECIAL 12: allow complex conditional logic
    
    ## no critic qw(RequireCarping)  # SYSTEM SPECIAL 13: allow die instead of croak
    
    ## no critic qw(ProhibitAutomaticExportation)  # SYSTEM SPECIAL 14: allow global exports from Config.pm
    
    
    # COMBO CRITICS
    
    ## no critic qw(ProhibitUselessNoCritic PodSpelling ProhibitExcessMainComplexity)  # DEVELOPER DEFAULT 1a: allow unreachable & POD-commented code; SYSTEM SPECIAL 4: allow complex code outside subroutines, must be on line 1

APPENDIX D: RPERL GRAMMAR

D.1: Eyapp Grammar Format & Sections

RPerl's grammar is written using the Eyapp computer programming language, which is a combination of normal Perl 5 and grammar expressions.

The grammar expression sections in Eyapp source code are written using an implementation of the Extended Backus-Naur Form (EBNF) language.

The file lib/RPerl/Grammar.eyp contains the uncompiled RPerl grammar, which is passed through the `eyapp` compiler command once to generate the output file lib/RPerl/Grammar.pm, which is then used by the `rperl` compiler command to parse RPerl input source code files.

Inside the lib/RPerl/Grammar.eyp file, there are several labeled file sections, which can be grouped into 4 major categories:

For more information, please view the following links:

D.2: Lexicon Token Types

Following is a list of all RPerl tokens in all 4 lexicon sections, along with examples of valid matching lexeme input.

The list must be in correct order for all regexes to match; earlier declarations get tried first, thus highly-specific tokens such as RPerl keywords and built-in operators appear first, while the least-specific tokens such as user-defined words appear last. This ordering can be considered "lexical matching", and is distinct from operator precedence and associativity as covered in the next section.
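
The effect of declaration order can be sketched in normal Perl 5. The token names and regexes below are hypothetical simplifications for illustration only, and are not copied from Grammar.eyp; the sketch merely shows why the most specific patterns must be tried first.

```perl
use strict;
use warnings;

# HYPOTHETICAL token table, NOT the real RPerl lexicon;
# entries are tried from top to bottom, most-specific first
my @TOKEN_TABLE = (
    [ 'KEYWORD_IF', qr/if\b/ ],         # keyword, must precede WORD
    [ 'OP_POW',     qr/[*][*]/ ],       # must precede OP_MULT
    [ 'OP_MULT',    qr/[*]/ ],
    [ 'WORD',       qr/[a-z]\w*/ ],     # least-specific, tried last
);

# return the name & lexeme of the first token found at the start of the input
sub first_token {
    my ($input) = @_;
    foreach my $entry (@TOKEN_TABLE) {
        my ( $name, $regex ) = @{$entry};
        if ( $input =~ /\A($regex)/ ) { return ( $name, $1 ); }
    }
    return;    # no token matched
}
```

If OP_MULT were declared before OP_POW in this sketch, then the input ** would be matched as two separate * tokens; likewise, if WORD were declared before KEYWORD_IF, then if would be matched as an ordinary user-defined word.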

D.2.1: Whitespace

[[[ LEXICON TOKENS, WHITESPACE ]]]

D.2.2: Types & Reserved Words

[[[ LEXICON TOKENS, TYPES & RESERVED WORDS ]]]

D.2.3: Operators

[[[ LEXICON TOKENS, OPERATORS ]]]

D.2.4: Punctuation & User-Defined Words

[[[ LEXICON TOKENS, PUNCTUATION & USER-DEFINED WORDS ]]]

D.3: Syntax Arity, Fixity, Precedence, Associativity

Operator "arity" is a technical term which means the number of input operands accepted by a specific built-in operator, or the number of input arguments accepted by a user-defined subroutine. An operator or function which accepts 0 input arguments is known as "nullary", 1 argument as "unary", 2 arguments as "binary", 3 arguments as "ternary", and so forth. The exit; operator may be called as nullary; the ++ increment operator is unary; the + addition operator is binary; and the substr operator may be called as ternary. Not to be confused with "a ternary operator", meaning any operator which accepts 3 operands, there is one specific operator known as "the ternary operator", which is a special kind of conditional operator accepting 3 input arguments. An operator or function which may accept a varying number of arguments is known as "variadic". Some RPerl operators are variadic, such as substr which may accept 2, 3, or 4 arguments. RPerl does not currently support variadic user-defined subroutines.

Operator Arity on Wikipedia
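
The variadic nature of substr can be seen in the following short sketch, written in normal Perl 5 rather than RPerl. The subroutine name and the example string are made up for illustration.

```perl
use strict;
use warnings;

# call substr with 2, 3, and 4 arguments; subroutine name is hypothetical
sub substr_arity_demo {
    my $text = 'Roadrunner';

    # binary: string & offset, returns everything from the offset onward
    my $two_arguments = substr $text, 4;    # 'runner'

    # ternary: string & offset & length
    my $three_arguments = substr $text, 0, 4;    # 'Road'

    # 4 arguments: string & offset & length & replacement;
    # returns the replaced characters and modifies $text in place
    my $four_arguments = substr $text, 0, 4, 'Rail';    # 'Road'

    return [ $two_arguments, $three_arguments, $four_arguments, $text ];
}
```

After the 4-argument call, the value stored in $text is Railrunner, because the first 4 characters have been replaced.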

Operator "fixity" is the notation form indicating the location of an operator when placed relative to its own input operands. "Prefix" operators are located before their operands, "infix" between operands, and "postfix" after operands. Additionally, operators which must be placed both before and after their operands are said to be of "closed" fixity (AKA "closefix"), while operators capable of more than one placement location are called "mixfix". Prefix notation is also known as "Polish notation", and postfix is called "Reverse Polish" notation. The abs absolute value operator is prefix; the + addition operator is infix; and the ++ increment operator can be called as postfix. The -( ) negative-with-parentheses operator is of closed fixity, because the parentheses component must appear both before and after the enclosed operand. Parentheses are always of closed fixity; in normal Perl, the - negative (without parentheses) is a prefix operator, but in RPerl we only allow the closed fixity -( ) negative-with-parentheses operator in order to avoid grammar ambiguity, because the same - dash (AKA hyphen) character is utilized for both the - negative and - subtraction operators. The ++ increment operator may also be called as prefix, so it may be classified as mixfix.

Prefix Notation on Wikipedia

Infix Notation on Wikipedia

Postfix Notation on Wikipedia
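
Each kind of fixity named above appears in the following short sketch, written in normal Perl 5. The subroutine and variable names are made up for illustration.

```perl
use strict;
use warnings;

# demonstrate closed, prefix, infix, & postfix operators; names are hypothetical
sub fixity_demo {
    my $counter = 5;

    my $negative = -(7);              # closed fixity: -( ) surrounds its operand
    my $absolute = abs $negative;     # prefix: abs comes before its operand, gives 7
    my $sum      = $absolute + 1;     # infix: + sits between its operands, gives 8

    my $postfix_result = $counter++;  # postfix: returns 5, then $counter becomes 6
    my $prefix_result  = ++$counter;  # prefix: $counter becomes 7, then returns 7

    return {
        negative => $negative,
        absolute => $absolute,
        sum      => $sum,
        postfix  => $postfix_result,
        prefix   => $prefix_result,
        counter  => $counter,
    };
}
```

The two ++ calls show why increment may be classified as mixfix: the same operator is accepted on either side of its operand, with a different return value in each position.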

Operator "precedence", also known as "order-of-operations", is a methodology used to determine which operator is executed first when 2 or more operators are adjacent to one another and parentheses are not used to explicitly separate them. A numeric precedence from 1 to 24 is assigned to each operator, and the operator with the lowest precedence number is given priority to execute first. Low precedence number equals high priority. The * arithmetic multiplication operator has a precedence number of 7, and + addition has a precedence of 8, so a + b * c is equivalent to a + (b * c), not (a + b) * c.

Operator Precedence on Wikipedia
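
The multiplication-before-addition example can be checked directly in normal Perl 5. The subroutine and variable names in this sketch are made up for illustration.

```perl
use strict;
use warnings;

# compare implicit precedence against explicit parentheses; names are hypothetical
sub precedence_demo {
    my ( $x, $y, $z ) = ( 2, 3, 4 );

    my $implicit       = $x + $y * $z;      # * binds tighter than +, gives 14
    my $multiply_first = $x + ( $y * $z );  # same grouping made explicit, gives 14
    my $add_first      = ( $x + $y ) * $z;  # parentheses override precedence, gives 20

    return [ $implicit, $multiply_first, $add_first ];
}
```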

Operator "associativity" is used to further determine precedence when multiple operators of the same priority are adjacent to one another. Each operator is designated as left-associative, right-associative, or non-associative. (Wikipedia incorrectly identifies associativity as a synonym for fixity, which is different, as described above.) Normal arithmetic operators are left-associative, meaning a - b - c is equivalent to (a - b) - c, not a - (b - c). Some operators such as mathematical power (AKA exponentiation) are right-associative, meaning a ** b ** c is equivalent to a ** (b ** c), not (a ** b) ** c. Operators which are not meant to be chained together are non-associative, such as the .. list range operator which takes scalar values as input but generates an array as output, so a .. b .. c is incorrect usage and will cause an error.

Operator Associativity on Wikipedia
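
Left and right associativity can be checked directly in normal Perl 5. The subroutine name and the numeric values in this sketch are made up for illustration.

```perl
use strict;
use warnings;

# compare left-associative subtraction against right-associative power;
# subroutine name is hypothetical
sub associativity_demo {
    my $subtract_chained = 10 - 4 - 3;        # left-associative, gives 3
    my $subtract_left    = ( 10 - 4 ) - 3;    # same grouping made explicit, gives 3
    my $subtract_right   = 10 - ( 4 - 3 );    # different grouping, gives 9

    my $power_chained = 2**3**2;              # right-associative, gives 512
    my $power_right   = 2**( 3**2 );          # same grouping made explicit, gives 512
    my $power_left    = ( 2**3 )**2;          # different grouping, gives 64

    return [
        $subtract_chained, $subtract_left, $subtract_right,
        $power_chained,    $power_right,   $power_left,
    ];
}
```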

In the following list of operators copied directly from Grammar.eyp, later declarations get higher priority, so all precedence numbers appear in strictly descending order from 24 to 1. (Implementation of operator arity and fixity is a bit less straightforward, and is not easily copied-and-pasted in one succinct list directly out of Grammar.eyp.) All operator arity, fixity, precedence, and associativity are taken directly from Perl 5.

Operator Precedence & Associativity in Perl 5 Documentation [1]

[[[ SYNTAX, OPERATOR PRECEDENCE & ASSOCIATIVITY ]]]

    %left       OP24_LOGICAL_OR_XOR
    %left       OP23_LOGICAL_AND
    %right      OP22_LOGICAL_NEG
    %left       OP21_LIST_COMMA
    %left       OP20_HASH_FATARROW
    %right      OP19_LOOP_CONTROL_SCOLON
    %right      OP19_LOOP_CONTROL
    %right      OP19_VARIABLE_ASSIGN_BY
    %right      OP19_VARIABLE_ASSIGN
    %right      OP18_TERNARY
    %nonassoc   OP17_LIST_RANGE
    %left       OP16_LOGICAL_OR
    %left       OP15_LOGICAL_AND
    %left       OP14_BITWISE_OR_XOR
    %left       OP13_BITWISE_AND
    %nonassoc   OP12_COMPARE_EQ_NE
    %nonassoc   OP11_COMPARE_LT_GT
    %nonassoc   OP10_NAMED_UNARY
    %nonassoc   OP10_NAMED_UNARY_SCOLON
    %left       OP09_BITWISE_SHIFT
    %left       OP08_STRING_CAT
    %left       OP08_MATH_ADD_SUB
    %left       OP07_MATH_MULT_DIV_MOD
    %left       OP07_STRING_REPEAT
    %left       OP06_REGEX_BIND
    %left       OP06_REGEX_PATTERN
    %right      OP05_MATH_NEG_LPAREN
    %right      OP05_LOGICAL_NEG
    %right      OP04_MATH_POW
    %nonassoc   OP03_MATH_INC_DEC
    %left       OP02_HASH_THINARROW
    %left       OP02_ARRAY_THINARROW
    %left       OP02_METHOD_THINARROW_NEW
    %left       OP02_METHOD_THINARROW
    %left       OP01_NAMED
    %left       OP01_NAMED_SCOLON
    %left       OP01_CLOSE
    %left       OP01_OPEN
    %left       OP01_QW
    %left       OP01_NAMED_VOID_SCOLON
    %left       OP01_NAMED_VOID_LPAREN
    %left       OP01_NAMED_VOID
    %left       OP01_PRINT

D.4: Syntax Production Rules

The EBNF metasyntax implemented by Eyapp is of the form:

ProductionRule: First Alternative 'foo' | Second Alternative 'bar' | ... | Last Alternative 'quux' ;

In this example, ProductionRule is a non-terminal left-hand-side (LHS) symbol, is followed by the : reduction metasymbol, and may be reduced (replaced) by any of the right-hand-side (RHS) sequences of terminal and non-terminal symbols, themselves separated by the | alternation (logical or) metasymbol. In other words, each LHS may become any of its corresponding RHS alternatives.

Terminal symbols are enclosed in single-quotes as with 'foo', never appear on the LHS, and are taken as literal data with no transformations applied. Eyapp treats terminal symbols as tokens which only match one hard-coded lexeme, which is the string appearing inside the single-quotes, foo in this example.
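
The idea that each LHS may become any of its RHS alternatives can be sketched in normal Perl 5. The following is not how Eyapp works internally; it is only a toy derivation over a made-up, non-recursive grammar, and the Greeting and Name rules are hypothetical and do not appear in Grammar.eyp.

```perl
use strict;
use warnings;

# HYPOTHETICAL toy grammar, equivalent to the Eyapp-style rules:
#     Greeting: 'hello' Name | 'goodbye' Name ;
#     Name:     'world' | 'roadrunner' ;
my %GRAMMAR = (
    Greeting => [ [ q{'hello'}, 'Name' ], [ q{'goodbye'}, 'Name' ] ],
    Name     => [ [q{'world'}], [q{'roadrunner'}] ],
);

# return every sentence which can be derived from the given symbol
sub derive {
    my ($symbol) = @_;

    # terminal symbols are single-quoted and taken as literal data
    if ( $symbol =~ /\A'(.*)'\z/ ) { return ($1); }

    # non-terminal symbols may become any of their RHS alternatives
    my @sentences;
    foreach my $alternative ( @{ $GRAMMAR{$symbol} } ) {
        my @partials = (q{});
        foreach my $rhs_symbol ( @{$alternative} ) {
            my @expansions = derive($rhs_symbol);
            my @extended;
            foreach my $partial (@partials) {
                foreach my $expansion (@expansions) {
                    push @extended, ( $partial eq q{} ? $expansion : "$partial $expansion" );
                }
            }
            @partials = @extended;
        }
        push @sentences, @partials;
    }
    return @sentences;
}
```

Calling derive('Greeting') in this sketch returns 4 sentences, one for each combination of the 2 Greeting alternatives and the 2 Name alternatives, such as hello world and goodbye roadrunner.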

D.4.1: File Formats

[[[ SYNTAX PRODUCTION RULES, FILE FORMATS ]]]

    CompileUnit:             Program | (ModuleHeader Module)+ ;
    Program:                 SHEBANG Critic? USE_RPERL Header Critic* Include* Constant* Subroutine* Operation+ ;
    ModuleHeader:            Critic? USE_RPERL? 'package' WordScoped ';' Header ;
    Module:                  Package | Class ;
    Package:                 Critic* Include* Constant* Subroutine+ LITERAL_NUMBER ';' ;
    Header:                  'use strict;' 'use warnings;' USE_RPERL_AFTER? 'our' VERSION_NUMBER_ASSIGN;
    Critic:                  '## no critic qw(' WORD+ ')';
    Include:                 USE WordScoped ';' | USE WordScoped OP01_QW ';' ;
    Constant:                'use constant' WORD_UPPERCASE OP20_HASH_FATARROW TypeInnerConstant Literal ';' ;
    Subroutine:              'our' Type VARIABLE_SYMBOL '= sub {' SubroutineArguments? Operation* '}' ';' ;
    SubroutineArguments:     LPAREN_MY Type VARIABLE_SYMBOL (OP21_LIST_COMMA MY Type VARIABLE_SYMBOL)* ')' OP19_VARIABLE_ASSIGN '@ARG;' ;
    Class:                   'use parent qw(' WordScoped ')' ';' Include Critic* Include* Constant* Properties MethodOrSubroutine* LITERAL_NUMBER ';' ;
    Properties:              'our hashref $properties' OP19_VARIABLE_ASSIGN LBRACE HashEntryProperties (OP21_LIST_COMMA HashEntryProperties)* '}' ';' |
                             'our hashref $properties' OP19_VARIABLE_ASSIGN LBRACE '}' ';' ;
    Method:                  'our' TYPE_METHOD VARIABLE_SYMBOL '= sub {' MethodArguments? Operation* '}' ';' ;
    MethodArguments:         LPAREN_MY Type SELF (OP21_LIST_COMMA MY Type VARIABLE_SYMBOL)* ')' OP19_VARIABLE_ASSIGN '@ARG;' ;
    MethodOrSubroutine:      Method | Subroutine;
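
As a small worked reading of one rule, the Critic production above can be approximated by a normal Perl 5 regular expression. This is only a sketch: it assumes a WORD is simply a run of word characters, whereas the real WORD token is defined in the lexicon section of Grammar.eyp, and the subroutine name is made up for illustration.

```perl
use strict;
use warnings;

# approximate the rule:  Critic: '## no critic qw(' WORD+ ')' ;
# ASSUMPTION: WORD is treated as one or more word characters
# return an array reference of policy names, empty if the line does not match
sub critic_policies {
    my ($line) = @_;
    if ( $line =~ /\A\s*[#][#] no critic qw[(]\s*(\w+(?:\s+\w+)*)\s*[)]/ ) {
        return [ split /\s+/, $1 ];
    }
    return [];
}
```

Given a line such as ## no critic qw(ProhibitAutoloading RequireArgUnpacking), this sketch returns the 2 policy names; a line with empty parentheses returns no names, because the WORD+ part of the rule requires at least 1 WORD.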

Code Examples:

CompileUnit

Class

D.4.2: Operations

[[[ SYNTAX PRODUCTION RULES, OPERATIONS ]]]

    Operation:               Expression ';' | OP01_NAMED_SCOLON | OP10_NAMED_UNARY_SCOLON | Statement ;
    Operator:                LPAREN OP01_PRINT FHREF_SYMBOL_BRACES ListElements ')' |
                             OP01_NAMED SubExpression | LPAREN OP01_NAMED ListElement OP21_LIST_COMMA ListElements ')' |
                             OP01_OPEN MY TYPE_FHREF FHREF_SYMBOL OP21_LIST_COMMA LITERAL_STRING OP21_LIST_COMMA SubExpression |
                             OP01_CLOSE FHREF_SYMBOL | OP03_MATH_INC_DEC Variable | Variable OP03_MATH_INC_DEC | SubExpression OP04_MATH_POW SubExpression |
                             OP05_LOGICAL_NEG SubExpression | OP05_MATH_NEG_LPAREN SubExpression ')' | SubExpression OP06_REGEX_BIND OP06_REGEX_PATTERN |
                             SubExpression OP07_STRING_REPEAT SubExpression | SubExpression OP07_MATH_MULT_DIV_MOD SubExpression |
                             SubExpression OP08_MATH_ADD_SUB SubExpression | SubExpression OP08_STRING_CAT SubExpression | SubExpression OP09_BITWISE_SHIFT SubExpression |
                             OP10_NAMED_UNARY SubExpression | OP10_NAMED_UNARY | SubExpression OP11_COMPARE_LT_GT SubExpression |
                             SubExpression OP12_COMPARE_EQ_NE SubExpression | SubExpression OP13_BITWISE_AND SubExpression |
                             SubExpression OP14_BITWISE_OR_XOR SubExpression | SubExpression OP15_LOGICAL_AND SubExpression | SubExpression OP16_LOGICAL_OR SubExpression |
                             SubExpression OP17_LIST_RANGE SubExpression | SubExpression OP18_TERNARY VariableOrLiteral COLON VariableOrLiteral |
                             OP22_LOGICAL_NEG SubExpression | SubExpression OP23_LOGICAL_AND SubExpression | SubExpression OP24_LOGICAL_OR_XOR SubExpression ;
    OperatorVoid:            OP01_PRINT (STDOUT_STDERR)? ListElements ';' | OP01_PRINT FHREF_SYMBOL_BRACES ListElements ';' |
                             OP01_NAMED_VOID_SCOLON | OP01_NAMED_VOID_LPAREN ListElements? ')' ';' | OP01_NAMED_VOID ListElements ';' | 
                             OP01_NAMED ListElement OP21_LIST_COMMA ListElements ';' | OP19_LOOP_CONTROL_SCOLON | OP19_LOOP_CONTROL LoopLabel ';' ;
    Expression:              Operator | WORD_UPPERCASE LPAREN ')' | CONSTANT_CALL_SCOPED | WordScoped LPAREN ListElements? ')' |
                             Variable OP02_METHOD_THINARROW LPAREN ListElements? ')' | WordScoped OP02_METHOD_THINARROW_NEW ')' ;
    SubExpression:           Expression | 'undef' | Literal | Variable | ArrayReference | ArrayDereference | HashReference | HashDereference | LPAREN SubExpression ')' ;
    SubExpressionOrInput:    SubExpression | FHREF_SYMBOL_IN | STDIN;
    SubExpressionOrVarMod:   SubExpression | VariableModification;
    Statement:               Conditional | (LoopLabel COLON)? Loop | OperatorVoid | VariableDeclaration | VariableModification ';' ;
    Conditional:             'if' LPAREN SubExpression ')' CodeBlock ('elsif' LPAREN SubExpression ')' CodeBlock)* ('else' CodeBlock)? ;
    Loop:                    LoopFor | LoopForEach | LoopWhile ;
    LoopFor:                 'for' MY TYPE_INTEGER VARIABLE_SYMBOL LPAREN SubExpression OP17_LIST_RANGE SubExpression ')' CodeBlock |
                             'for' LPAREN_MY TYPE_INTEGER VARIABLE_SYMBOL OP19_VARIABLE_ASSIGN OpNamedScolonOrSubExp VARIABLE_SYMBOL OP11_COMPARE_LT_GT OpNamedScolonOrSubExp SubExpressionOrVarMod ')' CodeBlock ;
    LoopForEach:             'foreach' MY Type VARIABLE_SYMBOL LPAREN ListElements ')' CodeBlock ;
    LoopWhile:               'while' LPAREN SubExpression ')' CodeBlock | 'while' LPAREN_MY Type VARIABLE_SYMBOL OP19_VARIABLE_ASSIGN SubExpressionOrInput ')' CodeBlock;
    CodeBlock:               LBRACE Operation+ '}' ;

Code Examples:

Operation

Operator

OperatorVoid

Expression

SubExpression

SubExpressionOrInput

Statement

D.4.3: Variable Data

[[[ SYNTAX PRODUCTION RULES, VARIABLE DATA ]]]

    Variable:                VariableSymbolOrSelf VariableRetrieval* ;
    VariableRetrieval:       OP02_ARRAY_THINARROW SubExpression ']' | OP02_HASH_THINARROW SubExpression '}' | OP02_HASH_THINARROW WORD '}' ;
    VariableDeclaration:     MY Type VARIABLE_SYMBOL ';' | MY Type VARIABLE_SYMBOL OP19_VARIABLE_ASSIGN OpNamedScolonOrSubExpIn | 
                             MY Type VARIABLE_SYMBOL OP02_ARRAY_THINARROW SubExpression ']' OP19_VARIABLE_ASSIGN 'undef' ';' | MY TYPE_FHREF FHREF_SYMBOL ';' ;
    VariableModification:    Variable OP19_VARIABLE_ASSIGN SubExpressionOrInput | Variable OP19_VARIABLE_ASSIGN_BY SubExpression ;
    ListElements:            ListElement (OP21_LIST_COMMA ListElement)* ;
    ListElement:             SubExpression | TypeInner SubExpression | OP01_QW | ARGV;
    ArrayReference:          LBRACKET ListElements? ']' ;
    ArrayDereference:        '@{' Variable '}' | '@{' TypeInner? ArrayReference '}' ;
    HashEntry:               VarOrLitOrOpStrOrWord OP20_HASH_FATARROW TypeInner? SubExpression | HashDereference | ENV ;
    HashEntryProperties:     OpStringOrWord OP20_HASH_FATARROW TypeInnerProperties ;
    HashReference:           LBRACE HashEntry (OP21_LIST_COMMA HashEntry)* '}' | LBRACE '}' ;
    HashDereference:         '%{' Variable '}' | '%{' TypeInner? HashReference '}' ;

Code Examples:

ListElement

D.4.4: User-Defined Words

[[[ SYNTAX PRODUCTION RULES, USER-DEFINED WORDS ]]]

    WordScoped:              WORD | WORD_SCOPED ;
    LoopLabel:               WORD_UPPERCASE ;
    Type:                    WORD | WORD_SCOPED | TYPE_INTEGER ;
    TypeInner:               MY Type '$TYPED_' OpStringOrWord OP19_VARIABLE_ASSIGN ;
    TypeInnerProperties:     MY Type '$TYPED_' OpStringOrWord OP19_VARIABLE_ASSIGN SubExpression | 
                             MY Type '$TYPED_' OpStringOrWord OP02_ARRAY_THINARROW SubExpression ']' OP19_VARIABLE_ASSIGN 'undef' ;
    TypeInnerConstant:       MY Type '$TYPED_' WORD_UPPERCASE OP19_VARIABLE_ASSIGN ;
    VariableOrLiteral:       Variable | Literal ;
    VarOrLitOrOpStrOrWord:   Variable | Literal | OpStringOrWord ;
    VariableSymbolOrSelf:    VARIABLE_SYMBOL | SELF ;
    Literal:                 LITERAL_NUMBER | LITERAL_STRING ;
    OpNamedScolonOrSubExp:   OP01_NAMED_SCOLON | OP10_NAMED_UNARY_SCOLON | SubExpression ';' ;
    OpNamedScolonOrSubExpIn: OP01_NAMED_SCOLON | OP10_NAMED_UNARY_SCOLON | SubExpressionOrInput ';' ;
    OpStringOrWord:          OP24_LOGICAL_OR_XOR | OP23_LOGICAL_AND | OP22_LOGICAL_NEG | OP19_LOOP_CONTROL_SCOLON | OP19_LOOP_CONTROL | OP12_COMPARE_EQ_NE |
                             OP11_COMPARE_LT_GT | OP10_NAMED_UNARY | OP08_MATH_ADD_SUB | OP07_MATH_MULT_DIV_MOD | OP07_STRING_REPEAT | OP01_NAMED | OP01_CLOSE | 
                             OP01_OPEN | OP01_NAMED_VOID | OP01_PRINT | WORD ;

There are no additional code examples for this section; all pertinent examples are contained in the previous sections.


APPENDIX E: BEYOND THE ROADRUNNER

Intermediate RPerl

The Scallion Book


Mastering RPerl

The Sword Book


GLOSSARY

A

Angle-Brackets

Used to wrap input streams such as <STDIN> or <$MY_FILEHANDLE>.

Also known as "chevrons"; created using "less-than" < and "greater-than" > characters.

B

Braces

"curly-brackets" AKA "curly-braces" AKA "braces" { }

See "Brackets"

Brackets

A group of opening and closing character pairs, including:

"square-brackets" AKA "brackets" [ ]

"round-brackets" AKA "parentheses" ( )

"curly-brackets" AKA "curly-braces" AKA "braces" { }

"angle-brackets" AKA "chevrons" AKA "less-than" and "greater-than" < >

See "Square-Brackets"

See "Parentheses"

See "Braces"

See "Angle-Brackets"

P

Parentheses

"round-brackets" AKA "parentheses" ( )

See "Brackets"

S

Square-Brackets

"square-brackets" AKA "brackets" [ ]

See "Brackets"


SEE ALSO

RPerl

rperl


AUTHOR

William N. Braswell, Jr.

mailto:william.braswell@NOSPAM.autoparallel.com