Tuesday, November 27, 2012

Node.js meet IBM PureApplication System – Part 2 of 3


This is a re-post from the Expert integrated systems blog.  Please go there for discussions and feedback.

In this second of three posts on Node.js, I will complete the implementation of the plug-in we designed in part one, then build and test it on an IBM PureApplication System.  In doing so I will cover the details of creating PureApplication System plug-ins, as well as using them.

Recap

In part one of this series, we looked at an overview of the plug-in mechanism of IBM PureApplication Systems and how we could add support for the new OSS web application stack: Node.js.
As I noted, Node.js makes some interesting architectural and implementation decisions which might make it more performant for some classes of workloads.  I also identified a minimal set of three attributes needed to create a Node.js cloud component so that we can support Node workloads in IBM PureApplication Systems.
If you did not read the previous post, or don’t recall its details, now might be a good time to go and peruse it.  In this post I am going to complete the plug-in realizing this Node.js component and build and test it on an installation of IBM PureApplication System.

Creating the Node.js plug-in

Now that we have an idea of how we want to structure our Node.js pattern, which plug-in we should create, and what attributes it needs to have, let’s dive in and create the plug-in.

Directory structure

As mentioned in the previous post, PureApplication System’s plug-ins have a strict form as far as the directory structure and some of the files located therein.
At the root of each plug-in are two manifest directories with files describing the plug-in: META-INF/MANIFEST.MF and OSGI-INF/node_jsXForm.xml.  Additionally, in the root directory there is a build.plugin.xml file, which is an Apache ANT file to build the plug-in.  We will not concern ourselves with these files as they contain mostly boilerplate content that is similar across all other IBM PureApplication System plug-ins.
Of the various files and directories in the resulting plug-in, we should mainly be concerned with two directories: “plugin” and “src.”  The “plugin” directory contains the files for the plug-in, of which two files and two directories are the most important: plugin/appmodel/metadata.json, plugin/config.json, plugin/parts/node_js.scripts, and the binary files in the directory plugin/parts/**.  The “src” directory optionally contains Java source files and Apache Velocity template files for further customization.  In this plug-in I will not use this feature.
The remaining files can mostly be copied from an existing example plug-in with minimal modifications.  Let’s take each of the files I modified in turn and discuss its content and the changes needed for Node.js.
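As a quick sanity check of the layout described above, a small Python sketch like the following could list which of the named files and directories are missing from a plug-in source tree.  The path list is taken from this post; the helper name missing_paths is hypothetical and is not part of the Plugin Development Kit.

```python
from pathlib import Path

# Paths named in this post, relative to the plug-in root directory.
EXPECTED_PATHS = [
    "META-INF/MANIFEST.MF",
    "OSGI-INF/node_jsXForm.xml",
    "plugin/appmodel/metadata.json",
    "plugin/config.json",
    "plugin/parts/node_js.scripts",
]


def missing_paths(plugin_root):
    """Return the expected paths that do not exist under plugin_root."""
    root = Path(plugin_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

Calling missing_paths(".") from the plug-in root returns an empty list when every expected entry is present.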

plugin/appmodel/metadata.json

The metadata.json file located in the “plugin” directory is the key file providing most of the plug-in’s description.  This includes its unique identification and all associated information such as name, description, images and help files.
The bottom section of the metadata.json includes the list of attributes that this plug-in supports.  For each attribute you list not only its name and description, but also its type (such as string, integer, or Boolean) as well as additional information such as a help hover string, an optional regular expression to validate input, and a message string to show when the input data is incorrect.
As I mentioned last time, the IBM PureApplication System pattern designer web tool parses the metadata.json for a plug-in to create a UI stylesheet containing the attributes listed for the plug-in.  This stylesheet lists all attributes and using the attribute’s type and other information can provide help to the user as far as the possible valid values for the attribute.
 The picture above shows the stylesheet for the Node.js plug-in.  Note how the version attribute includes a drop-down list for the values since these version values are fixed and noted as a list in the metadata.json section for the Node.js version attribute.
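To make the idea of typed attributes with validation concrete, here is a minimal Python sketch of how a tool might check user input against attribute descriptions.  The key names (id, type, options, regex, invalid_message), the attribute ids, and the version values are all placeholders chosen for illustration; they are not the actual metadata.json schema, which is not reproduced in this post.

```python
import re

# Hypothetical attribute descriptions; key names and values are illustrative
# only and do not reflect the real metadata.json schema.
ATTRIBUTES = [
    {"id": "app_name", "type": "string",
     "regex": r"[A-Za-z0-9_.-]+",
     "invalid_message": "Use letters, digits, dot, dash, or underscore."},
    {"id": "node_version", "type": "string",
     "options": ["0.8", "0.6"],  # placeholder values for the drop-down list
     "invalid_message": "Pick one of the listed versions."},
    {"id": "git_url", "type": "string",
     "regex": r"(https?|git)://\S+",
     "invalid_message": "Enter an http(s):// or git:// URL."},
    {"id": "run_npm", "type": "boolean",
     "invalid_message": "Must be true or false."},
]


def validate(attribute, value):
    """Return None if value is acceptable, else the attribute's message."""
    if attribute["type"] == "boolean":
        ok = isinstance(value, bool)
    elif "options" in attribute:
        ok = value in attribute["options"]
    elif "regex" in attribute:
        ok = (isinstance(value, str)
              and re.fullmatch(attribute["regex"], value) is not None)
    else:
        ok = isinstance(value, str)
    return None if ok else attribute["invalid_message"]
```

A fixed list of options maps naturally to a drop-down, while a regular expression plus a message maps to a free-text field with inline validation.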

plugin/config.json

When a plug-in is used to create a pattern in IBM PureApplication System, the pattern can be instantiated as a concrete deployment.  Typically the user must provide any missing information.  For most patterns using the Node.js plug-in, this means providing the URL of the Git project where the application to deploy resides.
Because the other attributes have defaults, patterns can typically be instantiated without changing them.  However, how does the system know which virtual machines (VMs) should be created, how many, and where each plug-in should be deployed?
While the IBM PureApplication System is usually able to automate the deployment of most patterns, it uses hints from the plug-in’s plugin/config.json file as its primary means to find the affinity of the plug-in for various VM parameters.  For instance, if the plug-in requires some specific CPU or memory lower limits then this can be specified in the config.json file.
Other information such as the image for the VM as well as the VM type can also be specified.  For our Node.js plugin, we will simply require the typical image which contains the default Linux operating system and most dependencies we need for Node.js.

plugin/parts/node_js.scripts

Every plug-in will eventually be deployed when used as part of a pattern.  The deployment process is where most of the plug-in’s work is executed.  The deployment is essentially a flow of well-known steps in which each plug-in gets configured, installed, and started.  As you may have guessed, these steps are carried out by executing the plug-in’s configure.py and install.py scripts, and optionally start.py when the application is started and stop.py when it is stopped.
The configure.py script is usually reserved for setting up, configuring, and downloading dependencies for the plug-in.  In our case, this is where we install PCRE (Perl Compatible Regular Expressions), a library that we need to run Node.js on the Linux VM that is created.  In install.py we divide the installation into three parts:
  1. Download and install the appropriate version of Node.js directly from Github.com.  This is shown in line 37, which calls a shell script to complete the installation (shown in the screenshot below).  In the shell script we get the parameters and download the correct Node.js version in lines 21 to 23.
  2. Similarly, in the Python script we download and install Git and NPM (Node Package Manager) at lines 29 and 39.
  3. Download the application from the Git repository, lines 44 to 46.
The start.py script is used, as expected, to start the application.  Because the app might have various NPM-packaged dependencies, we first run NPM and then start the Node.js application server in the app directory.  Whether NPM should be run is decided by the Run NPM attribute of the plug-in.  The basic command is something like:
exec /usr/bin/nohup /usr/local/bin/node $WORKING_DIR/$NODE_APP_NAME > /dev/null 2>&1 &
This assumes that environment variables are set for the working directory and the Node.js application name—passed as a user parameter from the plug-in.
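The start sequence can be sketched in Python roughly as follows.  This is an illustration under stated assumptions, not the plug-in’s actual start.py: the function names are hypothetical, it assumes “running NPM” means npm install in the application directory, and it is written for Python 3.

```python
import os
import subprocess


def build_start_commands(working_dir, app_name, run_npm,
                         node="/usr/local/bin/node", npm="npm"):
    """Return the commands to run, in order; the last one starts the server."""
    commands = []
    if run_npm:
        # Assumption: dependencies are installed with "npm install".
        commands.append([npm, "install"])
    commands.append([node, os.path.join(working_dir, app_name)])
    return commands


def start(working_dir, app_name, run_npm):
    """Run the setup commands, then launch the server detached from us."""
    *setup, server = build_start_commands(working_dir, app_name, run_npm)
    for command in setup:
        subprocess.check_call(command, cwd=working_dir)
    # Discard stdout and stderr, as the shell command above intends, and
    # start a new session so the server outlives this script (POSIX only).
    return subprocess.Popen(server, cwd=working_dir,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.STDOUT,
                            start_new_session=True)
```

Separating command construction from execution keeps the Run NPM decision easy to test without launching anything.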

plugin/parts/**

In the plugin/parts directory and its subdirectories we keep the actual installation tarball files for PCRE, NPM and Node.js.  This allows the plug-in to install when the VM does not have access to the Internet or, more commonly, when a download attempt fails.  In general, it’s a good idea to bundle all dependent files that the plug-in needs to set up and install correctly.  In our case, the correct version of PCRE is included since the rest of the Node.js setup depends on it being installed in the Linux environment.
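The “download first, fall back to the bundled tarball” idea could look something like the following Python 3 sketch.  The function name fetch_tarball and its parameters are hypothetical; the plug-in’s real scripts may handle this differently.

```python
import http.client
import shutil
import urllib.request


def fetch_tarball(url, bundled_copy, destination, timeout=30):
    """Try the network first; fall back to the bundled tarball on failure.

    Returns "downloaded" or "bundled" so the caller can log which path ran.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response, \
                open(destination, "wb") as out:
            shutil.copyfileobj(response, out)
        return "downloaded"
    except (OSError, ValueError, http.client.HTTPException):
        # No network, bad URL, or a broken transfer: use the copy shipped
        # in plugin/parts instead (overwrites any partial download).
        shutil.copyfile(bundled_copy, destination)
        return "bundled"
```

Returning which path was taken makes it easier to see in the deployment logs whether the VM actually reached the Internet.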

Building the Node.js plugin

Building the plug-in requires Java 6 and Apache ANT.  Java is needed since Apache ANT is a Java tool.  Running the build is easily done with one command:
$ ant -f plugin.build.xml
A correct installation of the Plugin Development Kit (PDK) means that the dependencies and other associated Java JARs will be on your CLASSPATH.  The ANT command, if successful, results in a tarball file created in the export directory.  This is the plug-in.  In our case this file is export/node_js-1.0.0.0.tgz.  The version is controlled by the value in the plugin/config.json.  Changing that will result in a new version for the plug-in.
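As a small illustration of how the version value determines the output file name, the Python sketch below derives the expected tarball path from a config.json-like document.  The key names name and version are assumptions made for this example; check your own plugin/config.json for the actual keys.

```python
import json


def expected_tarball(config_text, name_key="name", version_key="version"):
    """Return the export path implied by a config document.

    name_key and version_key are hypothetical key names for illustration.
    """
    config = json.loads(config_text)
    return "export/{}-{}.tgz".format(config[name_key], config[version_key])
```

With a document such as {"name": "node_js", "version": "1.0.0.0"} this yields export/node_js-1.0.0.0.tgz, matching the file produced by the build above.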

Deploying and testing the Node.js plugin

Now that we’ve created our Node.js plug-in and also discussed how to build the plug-in, one thing remains: how to use it?  In this section we aim to answer this question by first describing how to install the plug-in into an IBM PureApplication System and use the plug-in to create a simple Node.js pattern.  From there we will test the plug-in by deploying the pattern.

Adding plug-in to IBM PureApplication System

Once your plug-in is built the next step, before you can create patterns to test and use it, is to deploy the plug-in into an IBM PureApplication System.  This can be achieved in one of two ways:
1. If you have access to the IBM PureApplication System controller node via SSH, then you can add and remove plug-ins by executing the shell scripts /opt/IBM//plugin_add.sh and /opt/IBM//plugin_remove.sh.  The plugin_add.sh script takes as its argument the tarball file of the plug-in.  To the plugin_remove.sh script you pass the name and version of the plug-in to remove.
2. Alternatively, there is a web UI to manage plug-ins.  This can be accessed in the admin section of the IBM PureApplication System dashboard.  The installed plug-ins are listed there, and new plug-ins are added by uploading the tarball file for the plug-in.
When the plug-in is successfully added to the IBM PureApplication System plug-in catalog it will then show up in the tool palette on the left hand side when you attempt to create a new pattern or edit an existing pattern.

Creating a Node.js pattern

Now that the plug-in is installed, the next step is to create a simple pattern to deploy a Node.js application.  Doing so is simple using the drag-and-drop interface of the IBM PureApplication System pattern designer application.  You launch the application by clicking on Pattern -> Virtual Applications -> New (+ button) and selecting the “Blank Application” option so that the pattern starts with a blank sheet.
Once the pattern builder application is loaded, you simply need to drag-and-drop the Node.js cloud component from the left hand side list to the canvas.  Selecting the component shows its stylesheet.  There you can select the default Node.js version to use and specify whether NPM should be run to download dependencies.  Save the result as a pattern, giving it the name: Node.js pattern.

Testing a non-trivial Node application

Testing this new pattern is relatively easy.  You now need to create a new cloud application using the Node.js pattern, which shows up in the list of available patterns.  What remains is to give the application a name and point to the Git repository.  For testing purposes we will deploy the Nodejitsu Prenup, a collaborative Behavior-Driven Development (BDD) application found on Github.com at: https://github.com/nodejitsu/prenup.
Prenup is a pure Node.js OSS application that facilitates Behavior-Driven Development (BDD) by creating a collaborative engagement between developers and clients.  Once the cloud application is saved, it will show up in the list of cloud applications.  All that remains is to click deploy.  Once deployed, you can then access the details of the deployment, including the status of the VM, its log files and also a link to the deployed application.  From there you can manage the deployment, such as stopping or deleting it.

What Next?

In this post we completed the Node.js plug-in that we designed in the first post of this series.  In doing so we covered aspects of the plug-in in detail and also showed how to build and install this plug-in in an IBM PureApplication System.  Using the plug-in we created a simple Node.js pattern and used it to deploy the OSS Nodejitsu Prenup Node.js app from Github.com.
Next we will complete this series by discussing what can go wrong in the process so far and give hints for debugging and testing your plug-ins.  Additionally, we will also briefly highlight some advanced features, such as quality of service (QoS) and link plug-ins which can be used to extend this current plug-in and make it more functional, for instance, using a link plug-in to create a connection between Node.js and a database or adding scaling QoS properties to the current plug-in.

Wednesday, August 1, 2012

Node.js meet IBM PureApplication System – Part 1 of 3

This is a re-post from the Expert integrated systems blog.  Please go there for discussions and feedback.


In this three-part series I will explore how you can create IBM PureApplication Systems patterns for the “hot” open source web technology: Node.js.  In doing so, I will revisit the basics and the details behind creating PureApplication System plug-ins (extensions) as well as how to use them to create cloud application patterns.
The paramount goal of this series is to create more example plug-ins for the PureApplication System and contribute to the IBM PureSystem’s growing ecosystem.  Essentially, this is reinforcing the points we have been making in this blog about the openness of IBM PureSystems, its extensibility, its ease of use, and simply how it is transforming enterprise IT into the cloud.

Extending IBM PureApplication Systems

The IBM PureApplication System is the platform as a service (PaaS) platform for the IBM PureSystems family of products.  As I mentioned in a previous post (Should you virtualize your pattern or not?) PureApplication System has two means to define workloads: virtual applications (vApp) and virtual systems (vSys).  Each model has a set of extensibility points.  In this post I focus on vApp since that is the one with the most flexibility.
To create a vApp pattern one has to reuse or create plug-ins that represent the cloud components, cloud services, and the links constituting the pattern.  The IBM PureApplication System comes with a multitude of plug-ins supporting the patterns that are available right out of the box, such as J2EE applications, web-based applications, transactional applications, and other enterprise-oriented patterns.  There is also a growing ecosystem of patterns available in the IBM PureSystems Marketplace, where third party business partners have created plug-ins for additional components and made them available.
For this post we will focus on creating a plug-in for Node.js to support typical deployment patterns of this new web application stack.  This will be done in three parts and will cover all aspects of designing, implementing, building and deploying plug-ins, as well as using them to create patterns and, of course, using the resulting patterns.  With that, let’s jump right into the details, starting with a refresher overview of vApp plug-ins and then an overview of Node.js so we can have a good idea of what type of patterns we want to create, which will help define the design goals for our plug-in.

Anatomy of a vApp plug-in

In the IBM PureApplication System, patterns are composed (visually) using cloud components, cloud services, and links.  Cloud components represent a part of a software stack, such as the WebSphere Application Server, the DB2 database, or the Ruby on Rails application server, and others.  A cloud service is a component that is shared across deployments, for instance, the IBM MQ service is a cloud service that you can add to your pattern.  Finally, links represent connections among cloud components and between cloud components and cloud services.
Creating a new plug-in amounts to deciding where it logically fits into the categorization above.  Sometimes this means creating multiple plug-ins to address the fact that a component might need to be connected to other components; in this case, you would create a link plug-in to connect the components.
Once decided, the next important step is to decide the attributes that the component (or link) exposes and the type and values that each attribute can take.  For instance, for a Node.js component, a sensible set of initial attributes might be:
  • the name of the application
  • the version of Node.js to use
  • the URL to the git repository containing the Node.js application
Naturally, many more attributes are possible; however, these three make for an easy start.
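As a purely illustrative Python sketch, the three initial attributes could be modeled as a small record.  The class and field names are hypothetical and the example values are placeholders; the plug-in itself declares its attributes in metadata.json, described next.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class NodeJsComponent:
    """The three initial attributes listed above (names are illustrative)."""
    app_name: str      # the name of the application
    node_version: str  # the version of Node.js to use
    git_url: str       # the URL of the Git repository with the application


# Placeholder values, not taken from a real deployment.
component = NodeJsComponent(
    app_name="my-app",
    node_version="0.8",
    git_url="https://example.com/my-app.git",
)
properties = asdict(component)
```

Keeping the record immutable mirrors the idea that a pattern’s attribute values are fixed once a deployment starts.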
The IBM PureApplication System plug-in model is very systematic and uses a strict structure for all aspects of the plug-in.  In short, the structure captures the following four key items:
  • metadata.json – meta data information about the plug-in, such as: name, description, help files, image files, and the list of attributes exposed along with the attributes’ types.
  • config.json – contains additional configuration information to optionally give hints to the PureApplication System deployment process.
  • scripts – contains the scripts that will configure and install the cloud component that this plug-in represents.  Additional management scripts include those to stop, restart, and so on.
  • files – all supporting files for the cloud component, for instance, binary installation files and configuration files.

The plug-in’s UI is represented by the icon image set in the metadata.json.  The IBM PureApplication System pattern designer web application also automatically creates the appropriate remaining UI for the plug-in using the meta data information.  This includes providing property sheets for the plug-in’s attributes as well as showing help files and enabling policies and links, if these are enabled and loaded.
The final parts of a plug-in are the optional links and policies.  Links represent the connections between a cloud component and another cloud component, or between a cloud component and a cloud service.  When a cloud component supports a link, the link is added to its metadata.  Similarly, quality of service (QoS) policies are used to provide non-functional capabilities to a cloud component.  For instance, a policy can specify how a cloud component scales: horizontally, by replication, or vertically, by growing its instance resources such as CPU or memory.
Links and policies constitute advanced features of the IBM PureApplication System plug-in architecture and I may discuss them in a future post.  For now let’s focus on getting Node.js and the IBM PureApplication System acquainted.

Introduction to Node.js

Node.js is the “new hotness” in web application development.  But what makes Node.js interesting to web developers and what is the craze all about? And how does it compare to mature platforms such as JavaEE and Ruby on Rails?
Implementing an application server, in its simplest form, amounts to implementing a daemon process that executes client requests as they arrive.  The requests need to be parsed, routed to appropriate services, executed, and responded to.  The faster this cycle can be completed, the more requests can be handled, and thereby the better the resulting application server performs.

Naturally, requests sometimes fail and also need to be secured and isolated.  These complications, and others, imply that designing an application server usually results in using OS-level resources that are designed for concurrency control while providing some level of security and isolation.  As such, most application servers use OS processes or, more frequently, OS threads as a basis for their core implementation.  Such is the architecture of most Java-based application servers.
However, while OS processes and threads make it easy to create heavily concurrent applications and servers, they also have drawbacks.  Primarily, thread-based application servers (and even more so process-based servers) tend to be heavyweight, that is, they require lots of system resources: CPU, memory, and storage.  This can become a significant issue when running application servers that have to deal with a significant number of concurrent requests or when these requests last a long time or perform expensive computation, such as complicated database queries.
Node.js is designed to deal specifically with this “heavyweight” consequence of thread-based servers.  The distinguishing characteristic of the Node.js application server, which gives it its “lightweight” claim, is that instead of using OS-level mechanisms to deal with multiple requests, it uses application-level callbacks and non-blocking IO libraries.  The JavaScript language uses callbacks as an essential mechanism for modularization and extension, and Node.js, whose applications are written in JavaScript, uses the same callback mechanism.
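The callback style can be illustrated with a small single-threaded sketch.  It uses Python’s asyncio event loop as a stand-in for the Node.js event loop, and timers as a stand-in for non-blocking IO; it is an analogy for the mechanism described above, not Node.js code.  The function name run_demo is hypothetical.

```python
import asyncio


def run_demo(delays):
    """Handle simulated requests on one thread; return completion order."""
    if not delays:
        return []
    loop = asyncio.new_event_loop()
    completed = []

    def on_io_done(request_id):
        # Callback invoked by the event loop when the simulated I/O finishes.
        completed.append(request_id)
        if len(completed) == len(delays):
            loop.stop()

    for request_id, delay in enumerate(delays):
        # Register the callback and return immediately; nothing blocks here.
        loop.call_later(delay, on_io_done, request_id)

    try:
        loop.run_forever()
    finally:
        loop.close()
    return completed
```

For example, run_demo([0.03, 0.01, 0.02]) returns [1, 2, 0]: all three “requests” are in flight at once on a single thread, and each callback fires as soon as its own wait ends, without a thread per request.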
Another interesting aspect of Node.js is that it embeds the high-performance Google V8 JavaScript engine.  The V8 engine has revolutionized JavaScript virtual machines by utilizing advanced techniques pioneered in other dynamic languages such as Smalltalk and Self.  These include aggressive optimizations such as a generational, incremental garbage collector, just-in-time (JIT) compilation, and inline caching.  The result is a virtual machine that makes JavaScript a worthy competitor to other interpreted languages and even (in some cases) compiled languages.
An additional benefit of Node.js follows from web application programming interfaces (APIs) having moved to JavaScript Object Notation (JSON) as their primary data interchange format: most web services now include a back end that exposes a JSON-based API and a front end that uses the JSON data to create the user interface.  Creating the application using Node.js allows both the back end and front end to be in the same language, thus potentially simplifying the codebase and associated assets, such as data validators, test code, test data, and so on.

Node.js plugin goals

The primary goal for our Node.js PureApplication System plug-in is to allow Node.js applications to be simply and quickly installed, run, and managed on an IBM PureApplication System setup.  We will assume that the Node.js application is located in a Git repository, such as one on Github.com, and that it uses Node Package Manager (NPM) to manage dependencies.  The latest version of Node.js will be added into the plug-in and, with a version attribute, a different version can be selected and used instead.
To keep the plug-in simple, we will not deal with typical scaling features of Node.js applications such as using an Nginx HTTP/reverse proxy server, a load balancing service, or data caching services.  These additional features can be added to this current plug-in in the future and are left as an exercise to the reader.

What next?

In part 2 of this post, I will complete the Node.js plug-in and deploy a non-trivial open source Node.js application found on Github.com.  I will also discuss how you can build the plug-in and install it into an IBM PureApplication System setup.