Parrot: A Virtual Machine For Everyone
This article is an introduction to a very useful and widely used concept in computer systems: the 'Virtual Machine'. First we shall try to demystify the concept, and then we shall explore an open source virtual machine, 'Parrot'.
The term 'Virtual Machine' denotes a machine that does not exist physically but is nevertheless there and working. Confused? Let's start with a simple real-life example. Suppose a person A travels to a foreign land accompanied by an interpreter B. A knows only his or her native language, but B knows many foreign languages in addition to it. When A wants some work done in the foreign land, A gives instructions to B in the native language; B translates those instructions into the foreign language and instructs C, a native of the foreign land, to get A's work done. Wherever A goes, he or she simply instructs B in the native language, and B gets the work done through the proper translation and communication. B thus provides A with a uniform interface irrespective of country and language, and this is the main idea behind the 'Virtual Machine'. In this analogy, B is the 'Virtual Machine', A is the 'End User' and C is the 'Underlying Execution Mechanism'. This is shown in Fig-1.

Another well-known example is the Java Virtual Machine (JVM). Most of us are at least familiar with the term JVM, even without knowing its intricacies, because it is a key technology for Internet browsing. But why should we care about it, when most computer users use the JVM transparently while surfing the World Wide Web? If we dig deeper, we find that the JVM is the critical component that provides a uniform interface for the World Wide Web irrespective of the uneven structure of the Internet and of individual computer systems. I hope the concept of 'Virtual Machines' is now becoming easier to digest.
Let us now discuss how we get our work done using computers. First we prepare some data, instructions and their flow, together known as a program. Then we feed the program to a tool known as a compiler or interpreter, which translates it into the language of the underlying computing hardware; after execution, the program produces the desired result. Programs written in static programming languages (languages in which objects are first declared and then used) such as C and C++ are compiled into the native machine instructions of the underlying CPU and can therefore be executed directly by the hardware. Dynamic languages (languages in which objects need not be declared before use) such as Perl and Python, on the other hand, are usually compiled to CPU-independent instructions, and so is Java, although Java itself is statically typed. A 'Virtual Machine' (sometimes known as an interpreter) is required to execute those instructions. The whole mechanism is shown in Fig-2.
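The flow described above (a program translated into CPU-independent instructions that a virtual machine then executes) can be illustrated with a minimal Python sketch. The opcode names PUSH, ADD and MUL and the run function are hypothetical and belong to no real VM; the only point is that the same instruction list runs wherever the interpreter loop runs.

```python
# A toy virtual machine: the "byte code" is a list of (opcode, operand)
# tuples that does not depend on any real CPU. All names are illustrative.

def run(program):
    """Execute a list of (opcode, operand) tuples and return the result."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

# (2 + 3) * 4 expressed as CPU-independent instructions
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
```

Calling run(PROGRAM) evaluates the expression without the program ever mentioning a real machine instruction.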

We can visualize 'Virtual Machines' as software-created CPUs that give users the functionality they want irrespective of the underlying hardware. These Software-CPUs have instruction sets, known as Byte Codes, that are independent of the real CPU underneath. The 'Virtual Machine' concept is widely used to solve some very significant computing problems, although most of the time its use is transparent to end users. 'Virtual Machines' are typically used within programs and operating systems to solve real-life problems such as sharing the same hardware among many programs by partitioning it, making software portable across operating systems, and running older software on a newer computer. For example, we can easily run various operating systems within another operating system using operating system virtualisation software such as VMware and Xen (http://www.xen.sourceforge.net or http://www.cl.cam.ac.uk/Research/SRG/netos/xen/index.html).
All of these uses of 'Virtual Machines' are very important to the way we compute today. This much background is enough for this article, because 'Virtual Machines' are a very wide topic to explore fully.
An Introduction To "Parrot"
We now extend our discussion of 'Virtual Machines' by exploring some practical aspects of what we discussed above, using a real-world virtual machine: 'Parrot'. 'Parrot' is an open source 'Virtual Machine' (VM)/interpreter that was designed and developed from scratch to provide a uniform runtime for dynamic languages (e.g., Perl, Python, Tcl, Ruby), and the project has by now reached a fairly mature form. Although it is being developed as an efficient runtime engine for the upcoming Perl 6, it can serve many diverse kinds of languages efficiently. So if you want an execution engine for your custom-designed, portable programming language, it could be a smart choice.
'Parrot' is designed around the needs of dynamically typed languages (such as Perl and Python), so it should be able to run programs written in these languages more efficiently than VMs developed with static languages in mind (JVM, .NET). 'Parrot' is also designed to provide interoperability between the languages that compile to it. In theory, you will be able to write a class in Perl, subclass it in Python and then instantiate and use that subclass in a Tcl program.
But why should we prefer 'Parrot' over other VMs? The answer lies in the architecture of 'Parrot'. 'Parrot' is designed to resemble hardware CPUs as closely as possible. That is why the Parrot VM has a register architecture rather than a stack architecture (the JVM has a stack architecture). It also has extremely low-level operations. The reasoning behind this decision is primarily that, by resembling the underlying hardware to some extent, it becomes possible to compile 'Parrot' Byte Code down to efficient native machine language. The register architecture also makes it possible to draw on the existing literature on optimizing compilation for hardware CPUs.
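To make the difference between the two architectures concrete, here is a small Python sketch that computes a = b + c on a toy stack machine and on a toy register machine. Both instruction sets are invented for this illustration; they are neither the JVM's nor Parrot's.

```python
# Compare how a stack machine and a register machine compute a = b + c.
# Both instruction sets are hypothetical, not the JVM's or Parrot's.

def run_stack(program, variables):
    """Stack machine: operands are pushed and popped implicitly."""
    stack = []
    for op, arg in program:
        if op == "load":
            stack.append(variables[arg])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "store":
            variables[arg] = stack.pop()
    return variables

def run_register(program, registers):
    """Register machine: each instruction names its operands explicitly."""
    for op, dest, src1, src2 in program:
        if op == "add":
            registers[dest] = registers[src1] + registers[src2]
    return registers

STACK_CODE = [("load", "b"), ("load", "c"), ("add", None), ("store", "a")]
REGISTER_CODE = [("add", "I0", "I1", "I2")]
```

In this sketch the stack version needs four instructions where the register version needs one, at the price of each register instruction carrying more operands.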
Getting And Installing "Parrot"
The official site of 'Parrot' is www.parrotcode.org. As 'Parrot' is continuously in development, periodic releases appear on CPAN (ftp://ftp.cpan.org/pub/CPAN/authors/id/L/LT/LTOETSCH/parrot-(version number).tar.gz); the current release is version 0.3.1 'Wart'. The easiest solution is to grab the most recent snapshot of the Parrot SVN repository. It is a tar-gzipped download of a recent checkout of Parrot, updated every six hours. You can find it here:
http://svn.perl.org/snapshots/parrot/parrot-latest.tar.gz
After downloading Parrot, extract it using the command:
tar -zxvf parrot-(version number).tar.gz
This creates a parrot directory containing many subdirectories, such as docs (Parrot documentation), examples (Parrot programming examples), src (Parrot source code) and languages (some languages built around the Parrot runtime).
Now change your working directory to the parrot directory using the command:
cd parrot
To build 'Parrot', all you need is Perl 5.005_03 or later, a C compiler (any ANSI C compliant compiler should do) and some reasonable form of make. The first step to building 'Parrot' is to run the Configure.pl program, which looks at your platform and decides how 'Parrot' should be built. This is done by typing:
perl Configure.pl
Once this is complete, run the make program (sometimes called nmake or dmake) :
make
This should complete the building process, giving you a working 'Parrot' executable. Parrot has an extensive regression test suite. This can be run by typing:
make test
If you want the 'Parrot' debugger, you have to build it separately by typing:
make pdb
To install 'Parrot' properly, first log in as root and then type:
make install
This creates a parrot-(version number) subdirectory in /usr/local and copies the parrot executable (and pdb, if built) to /usr/local/parrot-(version number)/bin. To tweak the build and installation process further, try the options described in the README file in the installation directory.
To launch 'Parrot' from anywhere, include /usr/local/parrot-(version number)/bin in your PATH.
Programming With "Parrot"
Having covered the basics of 'Virtual Machines' and installed 'Parrot', we can now get our hands dirty by programming it. To program 'Parrot', we first have to know the forms in which we can instruct it to perform a task. We shall mainly use two forms in our programming examples. PASM (Parrot Assembly) is the assembly language of the 'Parrot' Software-CPU; if we program in PASM, we have to take care of everything ourselves, from register assignment to the calling conventions of functions. The second form, PIR (Parrot Intermediate Representation), is a level above PASM and is friendlier because it does many things automatically, hiding some of the low-level details. There is a third form as well: PAST (Parrot Abstract Syntax Tree), which is useful for writing compilers for the 'Parrot' runtime. We shall not use it in this article, because writing compilers is itself a very wide topic.
All of the above forms are automatically converted inside 'Parrot' to PBC (Parrot Byte Code). This is the machine language of the 'Parrot' Software-CPU, the form understood by the 'Parrot' interpreter. Like machine language, it is not intended to be human-readable or human-writable, but unlike the other forms, its execution can start immediately, without the need for an assembly phase. 'Parrot' Byte Code is platform independent.
The 'Parrot' Software-CPU instruction set includes arithmetic and logical operators, compare and branch/jump (for implementing loops, if...then constructs, etc.), finding and storing global and lexical variables, working with classes and objects, calling subroutines and methods along with their parameters, I/O, threads and more. As mentioned earlier, the 'Parrot' VM is register based, so we should expect a number of fast-access storage units, called registers, in the 'Parrot' Software-CPU. Indeed, 'Parrot' has four types of register, 32 of each: integers (I), numbers (N), strings (S) and PMCs (P). These registers are named I0 to I31, N0 to N31, S0 to S31 and P0 to P31. Integer registers are the same size as a word on the machine 'Parrot' is running on (32 bits on 32-bit CPUs and 64 bits on 64-bit CPUs), and number registers likewise map to a native floating point type.
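The register layout just described (four typed sets of 32 registers) can be pictured with a small Python model. The class name RegisterFile and the initial values are assumptions made for this sketch; it says nothing about how Parrot stores its registers internally.

```python
# Model of the four register sets described above: 32 registers each for
# integers (I), numbers (N), strings (S) and PMCs (P). Illustrative only;
# the initial values chosen here are an assumption of the sketch.

REGISTERS_PER_TYPE = 32

class RegisterFile:
    def __init__(self):
        self.sets = {
            "I": [0] * REGISTERS_PER_TYPE,
            "N": [0.0] * REGISTERS_PER_TYPE,
            "S": [""] * REGISTERS_PER_TYPE,
            "P": [None] * REGISTERS_PER_TYPE,
        }

    def _locate(self, name):
        # "I31" -> ("I", 31); reject unknown types and out-of-range numbers
        kind, index = name[0], int(name[1:])
        if kind not in self.sets or not 0 <= index < REGISTERS_PER_TYPE:
            raise KeyError(f"no such register: {name}")
        return kind, index

    def set(self, name, value):
        kind, index = self._locate(name)
        self.sets[kind][index] = value

    def get(self, name):
        kind, index = self._locate(name)
        return self.sets[kind][index]
```

With this model, I0 through I31 are valid names, while a name such as I32 is rejected.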
'Parrot' provides garbage collection, meaning that in 'Parrot' programs we do not need to worry about freeing memory explicitly; it will be freed when it is no longer in use (that is, no longer referenced) whenever the garbage collector runs.
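The idea that memory is freed once it is no longer referenced can be sketched as a tiny mark-and-sweep pass in Python. This is a toy model under stated assumptions (objects are names in a dictionary, and collect is a hypothetical function); it is not a description of Parrot's actual collector.

```python
# Minimal mark-and-sweep sketch: an object survives only if it can be
# reached from a root. A toy model, not Parrot's real garbage collector.

def collect(heap, roots):
    """heap maps an object name to the list of names it references.
    Removes unreachable objects from heap and returns their names."""
    reachable = set()
    pending = list(roots)
    while pending:                      # mark phase: follow references
        name = pending.pop()
        if name in reachable:
            continue
        reachable.add(name)
        pending.extend(heap[name])
    garbage = set(heap) - reachable     # sweep phase: drop the rest
    for name in garbage:
        del heap[name]
    return garbage
```

For example, with heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]} and roots ["a"], the objects c and d are freed even though they still reference each other, because nothing reachable refers to them.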
We shall now present some programming examples using PASM and PIR. First enter the program in any text editor and save it as filename.pasm or filename.pir. To run a program, you can assemble and execute it in one pass with:
parrot filename.pasm (or .pir)
If you would rather assemble separately, you can first generate a Byte Code file with :
parrot -o finalfilename.pbc filename.pasm (or .pir)
Then, you can run the pre-assembled Byte Code with:
parrot finalfilename.pbc
Here is our first example, in both PASM and PIR:
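The two-step workflow above (assemble once, then run the pre-assembled file) can be mimicked in a few lines of Python. The two-opcode instruction set and the 8-byte instruction layout are invented for this sketch and have nothing to do with the real PBC format; the fixed little-endian byte order is simply one way such a format can stay platform independent.

```python
import struct

# Hypothetical two-instruction machine: "print" appends its integer operand
# to the output, "end" stops. Each instruction is packed as two 32-bit
# little-endian integers (opcode, operand). Not the real PBC layout.
OPCODES = {"end": 0, "print": 1}

def assemble(source):
    """Turn assembly text into a bytes object (the 'byte code')."""
    code = b""
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue
        operand = int(parts[1]) if len(parts) > 1 else 0
        code += struct.pack("<ii", OPCODES[parts[0]], operand)
    return code

def execute(code):
    """Run pre-assembled byte code; returns the list of printed values."""
    output = []
    for offset in range(0, len(code), 8):
        opcode, operand = struct.unpack_from("<ii", code, offset)
        if opcode == OPCODES["end"]:
            break
        output.append(operand)
    return output
```

Here assemble plays the role of the -o step and execute the role of running the .pbc file: the text is parsed once, and execution afterwards touches only the packed bytes.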
hello.pasm
# Our First Example
print "Hello Open Source!\n" # prints the string
end # stops execution
The print operator is polymorphic: it can print integers, numbers, strings, and string constants. The program prints Hello Open Source! on the console. The PIR version of the program looks like this:
hello.pir
.sub oss
# Our First Example
print "Hello Open Source!\n"
.end
First, .sub oss tells the compiler that we are beginning a subroutine called oss. In this case the name of the sub isn't important, only its position: unless you tell it otherwise, parrot will execute the first subroutine that is defined. The .end at the end tells parrot that our sub definition is complete. Note that PIR requires any code you give it to be placed in a subroutine.
Since 'Parrot' is a register-architecture VM, our next example deals with register manipulation. The PASM version of the program is:
regs.pasm
set I1, 7
set N1, 3.142857
set S1, "Programming"
print I1
print ", "
print N1
print ", "
print S1
print "\n"
end
Assignment to a register is done with the set operator. The same operator is also used to copy between registers, including registers of different types.
The output of the program is :
7, 3.142857, Programming
The PIR version (with a minor change: it adds a named variable) looks like this:
regs.pir
.sub temps
$I97 = 7
$N61 = 3.142857
$S70 = "Programming"
.local num magic_no
magic_no = 22/7.0
print $I97
print ", "
print $N61
print ", "
print $S70
print ", "
print magic_no
print "\n"
.end
The output of the program is:
7, 3.142857, Programming, 3.142857
As discussed above, in PASM you are required to keep track of all your registers yourself. In PIR you can use "temporary" registers, so you no longer have to worry about the actual register bookkeeping. When a register name is prefixed with $, we do not know which of the "physical" registers is being used, only its type. In addition to temporary registers, you can declare named registers, which are effectively subroutine-specific variables. They do not necessarily correspond to high-level language variables (those are more likely to be declared as lexicals or globals). We can trace the Byte Code of the above program to learn more about the offsets within the Byte Code, the actual opcodes used (of the Software-CPU), the values of the registers before each opcode executes, and so on. The command for tracing is:
parrot -t 1 filename.pir (or filename.pbc)
The trace of regs.pir is shown below :
0 set I30, 7 - I30=0
3 set N30, 3.14286 - N30=0.000000,
6 set S30, "Programming" - ,
9 set N15, 3.14286 - N15=0.000000,
12 print I30 - I30=7
14 print ", "
16 print N30 - N30=3.142857
18 print ", "
20 print S30 - S30="Programming"
22 print ", "
24 print N15 - N15=3.142857
26 print "\n"
28 null I0 - I0=0
30 null I3 - I3=1
32 returncc
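The mapping from $-prefixed temporary registers to numbered registers, visible in this trace, can be sketched in Python. The first-free policy used here is an assumption made for illustration only; the trace shows that Parrot's own allocator chooses differently (for example I30 rather than I0).

```python
# Sketch of mapping PIR-style temporary registers ($I97, $N61, ...) onto
# numbered registers of the same type. The first-free policy is an
# assumption of this example, not Parrot's real allocation strategy.

def allocate(temporaries, per_type=32):
    """Map each temporary name to a numbered register of the same type."""
    next_free = {}   # register type -> next unused index
    mapping = {}
    for name in temporaries:
        if name in mapping:
            continue                         # already has a register
        kind = name[1]                       # "$I97" -> "I"
        index = next_free.get(kind, 0)
        if index >= per_type:
            raise RuntimeError(f"out of {kind} registers")
        mapping[name] = f"{kind}{index}"
        next_free[kind] = index + 1
    return mapping
```

The number after the $ carries no meaning of its own: $I97 and $I5 are simply two distinct integer temporaries, and each receives whichever integer register is free.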
In the trace we can clearly see that the registers actually used are I30, N30 and S30 (with N15 holding magic_no), and that the assignment operator has been replaced by the set opcode. We can also see the register values before each opcode executes (for example, I30=0).
Decision making is an important and necessary feature of any implementation, so our next example covers the branching and conditional capabilities of 'Parrot'. The branch operator takes a label. A label is written at the start of a line and is followed by a colon. The following conditional operators are available: eq (equal), ne (not equal), lt (less than), le (less than or equal), gt (greater than), ge (greater than or equal). Each takes two integer or numeric operands and a label to jump to if the condition is true; if the condition is false, execution continues with the next statement. Our next example implements decision making using these operators.
brnch_condtls.pasm
set I1, 1
DAYS: gt I1, 7, END
print I1
print "Day"
print " "
inc I1
branch DAYS
END: print "\n"
print "A Week"
print "\n"
end
The output of the program is:
1Day 2Day 3Day 4Day 5Day 6Day 7Day
A Week
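How labels and conditional branches drive this loop can be seen in a toy Python interpreter that mirrors the PASM program instruction by instruction. The tuple encoding and the run function are inventions of this sketch, and only the opcodes used in the example are modelled; it does not reflect how Parrot itself is implemented.

```python
# Toy interpreter for the loop above: labels become instruction indexes and
# a program counter (pc) decides what runs next. Only the opcodes used in
# the example are modelled.

PROGRAM = [
    (None,   "set",    ("I1", 1)),
    ("DAYS", "gt",     ("I1", 7, "END")),
    (None,   "print",  ("I1",)),
    (None,   "print",  ("Day",)),
    (None,   "print",  (" ",)),
    (None,   "inc",    ("I1",)),
    (None,   "branch", ("DAYS",)),
    ("END",  "print",  ("\n",)),
    (None,   "print",  ("A Week",)),
    (None,   "print",  ("\n",)),
    (None,   "end",    ()),
]

def run(program):
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    regs, out, pc = {}, [], 0
    while True:
        _, op, args = program[pc]
        pc += 1                              # default: fall through
        if op == "set":
            regs[args[0]] = args[1]
        elif op == "inc":
            regs[args[0]] += 1
        elif op == "print":
            # a known register name prints its value, anything else is text
            out.append(str(regs.get(args[0], args[0])))
        elif op == "gt" and regs[args[0]] > args[1]:
            pc = labels[args[2]]             # condition true: jump
        elif op == "branch":
            pc = labels[args[0]]             # unconditional jump
        elif op == "end":
            return "".join(out)
```

When the gt condition is false, nothing happens and the program counter simply moves on to the next instruction, which is exactly the fall-through behaviour described above.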
The PIR version looks like:
brnch_condtls.pir
.sub loop
.local int counter
counter = 1
DAYS: if counter > 7 goto END
print counter
print "Day"
print " "
inc counter
goto DAYS
END:
print "\nA Week\n"
end
.end
Although PIR does not provide high-level language constructs such as loops, we can easily build them from its conditional handling. Our next example concerns subroutines.
subroutine.pasm
bsr BYE
end
BYE:
print "Goodbye Cruel World\n"
ret
The subroutine above neither takes arguments nor returns values. The bsr (Branch SubRoutine) operator jumps to the label given as its operand, and ret (Return) returns from the subroutine. The output of the program is:
Goodbye Cruel World
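The jump-and-return behaviour of bsr and ret can be modelled in Python as follows. The explicit return-address stack is a modelling assumption chosen for clarity, and the tuple encoding is hypothetical.

```python
# Sketch of bsr/ret: bsr remembers where to come back to, ret goes there.
# The return-address stack below is a modelling assumption for illustration.

PROGRAM = [
    (None,  "bsr",   "BYE"),
    (None,  "end",   None),
    ("BYE", "print", "Goodbye Cruel World\n"),
    (None,  "ret",   None),
]

def run(program):
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    return_stack, out, pc = [], [], 0
    while True:
        _, op, arg = program[pc]
        pc += 1
        if op == "bsr":
            return_stack.append(pc)      # index of the instruction after bsr
            pc = labels[arg]
        elif op == "ret":
            pc = return_stack.pop()      # resume after the matching bsr
        elif op == "print":
            out.append(arg)
        elif op == "end":
            return "".join(out)
```

Note that the end instruction sits between the call and the subroutine body; without it, execution would fall through into BYE a second time.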
Now for our last, but not least, example. As mentioned above, the JVM has a stack architecture, in contrast to the register architecture of 'Parrot'. In simple terms, a stack is a data structure that works on a 'Last In, First Out' basis. So we now use a stack in 'Parrot' through the save (PUSH operation) and restore (POP operation) operators. The program looks like this:
stack.pasm
set I1, 1
save I1
print I1
print " "
set I1, 2
save I1
print I1
print " "
set I1, 3
save I1
print I1
print "\n"
restore I1
print I1
print " "
restore I1
print I1
print " "
restore I1
print I1
print " "
print "\n"
end
The output is :
1 2 3
3 2 1
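The same 'Last In, First Out' behaviour can be reproduced with a Python list standing in for the stack. The function name demo is hypothetical; the comments relate each step to the PASM instruction it mirrors.

```python
# The save/restore example above, modelled with a Python list as the stack.
# save pushes the current register value; restore pops the latest one.

def demo():
    stack, first_line, second_line = [], [], []
    for value in (1, 2, 3):
        register = value            # set I1, value
        stack.append(register)      # save I1
        first_line.append(register) # print I1
    while stack:
        register = stack.pop()      # restore I1
        second_line.append(register)
    return first_line, second_line
```

The first list holds the values in the order they were saved and the second in the order they were restored, which is the reverse.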
The examples presented above are only meant to give a feel for programming 'Parrot'. In fact 'Parrot' offers a great deal of programming functionality, and the docs and examples subdirectories (in the parrot directory) are the perfect place to start exploring it fully.
Portable Programming Languages Using "Parrot"
As we have learnt, 'Virtual Machines' provide portability across different machine architectures. So if someone wants to design a portable programming language that offers a "Write Once, Run Anywhere" model, 'Parrot' serves the purpose very well. In fact, you can study many concept languages written for the 'Parrot' runtime in the 'languages' subdirectory.
Let's now discuss using 'Parrot' as a runtime engine for custom-designed portable languages. First we define a language with all its constructs and grammar; then we design a parser that takes source code in that language and creates a syntax tree from it. As described above, we can convert the output of the parser into PAST format and then use 'Parrot' as the runtime engine for our custom portable language. This is only a very simple outline of the whole plan, because we also have to implement optimization at the various stages. The whole scheme is shown in Fig-3.
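The parse-then-generate pipeline outlined above can be sketched for a one-line arithmetic "language". In this sketch Python's own ast module stands in for the custom parser, and the generator skips PAST entirely, emitting PASM-like text directly; the function compile_expr, the register numbering and the choice of opcodes are all assumptions of the example, and no optimization is attempted.

```python
import ast

# Sketch of the pipeline above: source -> syntax tree -> PASM-like text.
# Python's ast module plays the parser; register numbering is naive
# (one fresh integer register per value) and purely illustrative.

OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}

def compile_expr(source):
    """Return PASM-like lines that evaluate an integer expression."""
    lines = []
    counter = [0]

    def fresh():
        reg = f"I{counter[0]}"
        counter[0] += 1
        return reg

    def emit(node):
        if (isinstance(node, ast.Constant) and isinstance(node.value, int)
                and not isinstance(node.value, bool)):
            reg = fresh()
            lines.append(f"set {reg}, {node.value}")
            return reg
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left, right = emit(node.left), emit(node.right)
            reg = fresh()
            lines.append(f"{OPS[type(node.op)]} {reg}, {left}, {right}")
            return reg
        raise SyntaxError("unsupported construct")

    result = emit(ast.parse(source, mode="eval").body)
    lines += [f"print {result}", 'print "\\n"', "end"]
    return lines
```

For the input 2 + 3 * 4, the tree walk visits the multiplication before the addition, so the generated text computes the product first, which is how the syntax tree carries operator precedence into the output.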

I hope to write a complete article on designing a custom portable language using the 'Parrot' runtime sometime in the future.
Conclusion
In this article we have explored the basics of 'Virtual Machines' and seen how powerful the concept is for portability across different platforms. We have also explored the intricacies of the intelligently designed 'Parrot' VM. We can now look forward to using this VM for the development of customized portable applications.
References
www.cs.gmu.edu/cne/itcore/virtualmachine/
www.parrotcode.org
Author's Information
I'm Ankur Kumar Sharma, an Engineer from India. I'm passionate about Free Open Source Software (FOSS) and hacking GNU/Linux is my hobby. My contact address is:
ankur_k_s@yahoo.com