InsideVault

The IV Tech Blog

A Gentle Introduction to Functional Programming with Scala

Startups must be nimble and change direction on a dime when circumstances demand it. About five months ago we abandoned most of our back-end code written in Java and Groovy, and re-implemented it in Scala. Why? Let me tell you.

Java was a breakthrough 15 years ago, with its clean object-oriented structure, cross-platform portability, garbage collection that releases programmers from the shackles of malloc() and free(), and reasonably good performance for a non-native language. However, it often requires excessively verbose code due to its rigid syntax and boilerplate conventions (like getters and setters). It’s all too easy to lose the concepts you’re really trying to express amidst all that noise, and I’m all too familiar with that sinking feeling of impending tedium when I realize that I have to write a bunch more boilerplate code to solve a seemingly simple problem (or even worse have to change a bunch of boilerplate code when I realize I’ve made a mistake).

Groovy improved on Java in several ways while retaining compatibility with industry-standard JVMs. It has a more dynamic and expressive syntax, and it introduces some functional programming concepts like functions as first-class objects. In case you’re not familiar with the latter, it allows you to pass a function around like any other Object so that it can be executed at some other time in some other context. Sure, you can do this in Java with something like the Strategy pattern, but at the cost of more boilerplate code. Also, Groovy is the core language for the Grails web framework, which greatly simplified our early efforts to prototype our SaaS application.

However, we outgrew this technology choice as we fleshed out more of our platform. Based on some informal performance tests we did, Groovy code executes 4 to 40 times slower than Java. We’re certainly not the first to make this observation (the published comparisons also show Scala performance on par with Java). Even worse, the dynamic typing that contributes to a pleasant Groovy programming experience also translates into a nightmare for maintaining and testing a large application. There are a lot of bugs that can slip past its lax compiler and not reveal themselves until their code is executed, and this is unacceptable for a large, complex application.

Enter Scala, another 100% JVM-compatible language. Iceman has been advocating for Scala ever since… well, his interview with us. And frankly, it’s the best language I’ve learned yet. It has an expressive syntax like Groovy, but it’s backed up by a strong, static type system that won’t let you contradict yourself later on. For example, when you assign var foo = 5, the Scala compiler infers that foo is of type Int (like a cross between Java’s int and Integer, without the autoboxing overhead). If you later try to reassign foo = "some string", the compiler will reject it with a type mismatch error.

But there’s so much more to Scala than this toy example, far more than I can cover in a short article. Its core advantage is its seamless blending of object-oriented and functional programming concepts. I assume that everyone is familiar with object-oriented programming, which has been the bread-and-butter of professional programming for decades. However, functional programming has always been somewhat esoteric, especially for those of us with non-traditional CS backgrounds. My eyes began to open when I heard an impassioned pitch for functional programming concepts at the 2008 International Conference on Autonomic Computing, but I didn’t really do much with it for four years except make a mental note that “I should learn more about that sometime”.

In a nutshell, functional programming is about corralling the code that does stuff (that is, code with side effects) into a small number of well-defined places. Examples include hitting the database, logging, and communicating with the AdWords and Bing APIs. The majority of your code should be pure functions, doing nothing more than taking input parameters and spitting out results. They don’t touch the input parameters (immutable values and collections enforce this), they don’t keep state, and they don’t even so much as write “Boo!” in the log (although they can return an intention for some other code to write “Boo!” in the log).

There are several advantages to this. One is that a pure function will always return the same result for the same input parameters, which makes it super easy to test and debug. Another is that it’s easy to parallelize; no shared state means no need for complex locking schemes, and no risk of deadlocks or resource contention. You can save those mental pretzel-makers for the small amount of code that actually does stuff, and spend the rest of your time focusing on the algorithms for how to calculate what needs to be done. These algorithms are probably your real value-add anyway.

A great capsule summary I read about object-oriented versus functional programming is that object-oriented is focused on the nouns, whereas functional is focused on the verbs. Obviously both are important in a large, complex project, and which one you want to highlight depends on the problem you’re trying to solve. When describing the plethora of objects in the AdWords API and how we handle each of them in our campaign management code, I took an object-oriented approach. Nouns, nouns, nouns. There were a lot of nouns. By contrast, when I ported the code for our automated keyword grouping algorithm to Scala, I favored the functional paradigm. It was mostly about the verbs for identifying similar keywords.

Given that functional programming is all about the verbs, I’d like to wrap up with a few of my favorite Scala verbs. These are defined on all of the Scala Collections classes, from Traversable (like Iterable, but restricted to “internal iteration”) all the way down the inheritance hierarchy. Without further ado:

  1. filter: Takes a function that processes each element of the collection and returns true or false. The result is the subset of elements for which the input function is true. Simple and sweet! There’s also a filterNot as a convenient complement.
  2. partition: What if you want to keep both classes of elements (true and false)? partition returns a pair of collections: the first is everyone for which the input function returns true, and the second contains the remaining falsies.
  3. groupBy: Taking another step up the abstraction hierarchy, this accepts a function that processes each element of the collection and returns a value of some arbitrary type (you get to pick the type, so the compiler can yell at you later if you contradict yourself). What’s done with this return value? It becomes a key for a Map that groupBy returns, and its corresponding value is the subset of elements for which the input function returned that key. Think of groupBy as an n-way partition.
  4. flatten: Let’s say you want to go the opposite direction: from the Map back to the original collection. For starters, you could call fooMap.map(_._2). Obvious, huh? (just kidding) The map function shouldn’t be confused with the Map collection. It’s more like the map in map-reduce, but not exactly the same. In this example, it takes the key-value pairs from the Map, and returns just the values (position 2 of the key-value 2-tuple). But we still have a problem. The result is a collection of collections, such as a List[List[Element]]. To get it back to the way it was, we call flatten on it, which frees the Elements from the second layer of Lists, and returns a simple List[Element]. The flatten and closely related flatMap functions are also the foundation for the powerful and advanced magic of monads. Very cool!
  5. foldLeft and foldRight: These are the Swiss Army knives of Scala Collections functions. You can basically do whatever you want with these functions. They’re a bit like the reduce part of map-reduce (and Scala Collections has closely related reduceLeft and reduceRight functions). The fold functions take two parameters: an initial starting value (zero, an empty list, a Siamese cat, whatever you like), and an input function. The input function in turn takes an element from the collection and an object of the same type as your starting value, and it returns another object of that same type. With foldLeft, the first call receives the initial starting value and the first element of the collection; each later call receives the previous result and the next element. foldRight does the same thing starting from the last element and working backwards. Where it goes from there is up to the input function. Most problems can be solved more simply with one of the previous functions, but when you’ve got a tough nut to crack, foldLeft and foldRight are a tough pair of nutcrackers. Use them wisely.
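
None of these verbs is unique to Scala. As a rough analogy only, and written in JavaScript rather than Scala, here is a minimal sketch of the same five ideas on a plain array. The words list and the isShort predicate are made up for illustration. JavaScript's reduce stands in for foldLeft, and partition, groupBy and flatten are built by hand on top of it, which hints at why folds are the Swiss Army knives of the bunch.

```javascript
// Illustrative data and predicate (hypothetical, not from our codebase)
var words = ['ant', 'bee', 'cat', 'horse', 'zebra'];
var isShort = function (w) { return w.length === 3; };

// filter: keep the elements for which the predicate is true
var shortWords = words.filter(isShort);

// partition: one fold that builds a pair [matches, nonMatches]
var parts = words.reduce(function (acc, w) {
  return isShort(w)
    ? [acc[0].concat(w), acc[1]]
    : [acc[0], acc[1].concat(w)];
}, [[], []]);

// groupBy: an n-way partition keyed by whatever the key function returns
var byLength = words.reduce(function (acc, w) {
  var next = Object.assign({}, acc); // copy instead of mutating the accumulator
  next[w.length] = (acc[w.length] || []).concat(w);
  return next;
}, {});

// map + flatten: take just the values of the grouping, then remove one level of nesting
var groups = Object.keys(byLength).map(function (k) { return byLength[k]; });
var flattened = groups.reduce(function (acc, group) {
  return acc.concat(group);
}, []);
```

In Scala the middle steps are one-liners (partition and groupBy are built in), so treat this as a picture of what those methods do rather than how you would write them.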

Hopefully this article lived up to its title and provided you with a gentle introduction to Scala and its application of functional programming concepts. Please let me know what you think and if there are any aspects of Scala or functional programming that I could drill into more deeply with my next article.

Practical AJAX Error Handling

Error handling is often overlooked and poorly tested because, in a lot of cases, errors are not supposed to happen, or if they do happen, they do not happen that often. The example below shows how to handle AJAX request errors in a practical way. It is written using the Ext JS framework, but conceptually it should carry over to any JavaScript framework.


Ext.Ajax.request({
  url:'/myDestinationUrl/myAction',
  success:function (response, opts) {
    var parsedResponse = JSON.parse(response.responseText);
    if (parsedResponse.success) {
      // (1) success! do my stuff
    } else {
      // (2) the server returns success = false - display error message (the error message is in parsedResponse.message)
    }
  },
  failure:function (response, opts) {
    // (3) request failure - display error message
  }
}); 

Basically, there are three branch points that we have to take care of:

(1) The server returns HTTP 200 with a success response.
(2) The server returns HTTP 200 with a failure response.
(3) The server returns HTTP 400/500, or does not respond to the request at all (timeout).

On the server side, here is an example of how the code maps to those three branch points (branch #3 covers anything that cannot be captured by the ‘catch’ block, or anything that does not reach the server at all):


def myAction = {
  try {
    // do my stuff
    ...
    [success: true] // (1)
  } catch (Exception e) {
    ...
    [success: false, message: 'error message'] // (2)
  }
}
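
The three branches can also be folded into one small, framework-independent helper so that every request is classified the same way. What follows is only a sketch under some assumptions: classifyResponse is a hypothetical name (it is not part of Ext JS), any status outside 200 to 299 is treated as a failure, a timeout is assumed to surface as a status such as 0, and a 200 response whose body is not valid JSON is treated as branch 3 as well, since JSON.parse would otherwise throw.

```javascript
// Hypothetical helper: maps an HTTP status and response body to one of the
// three branches described above. Not part of Ext JS.
function classifyResponse(status, responseText) {
  // Branch 3: the request itself failed (assumption: timeouts report a non-2xx status such as 0)
  if (status < 200 || status >= 300) {
    return { branch: 3, message: 'Request failed (HTTP ' + status + ')' };
  }

  var parsed;
  try {
    parsed = JSON.parse(responseText);
  } catch (e) {
    // Assumption: an unparseable body on HTTP 200 is handled like a request failure
    return { branch: 3, message: 'Malformed server response' };
  }

  // Branch 1: HTTP 200 and the server reports success
  if (parsed && parsed.success) {
    return { branch: 1, data: parsed };
  }

  // Branch 2: HTTP 200 but the server reports success = false
  return { branch: 2, message: (parsed && parsed.message) || 'Unknown server error' };
}
```

Both the success and failure callbacks could then call classifyResponse(response.status, response.responseText) and switch on the returned branch, which keeps the error-message display in one place.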

Transitioning from RightScale to Chef with Amazon CloudFormation in Amazon’s AWS – Part 1 of 3

This is part one of a three-part series, initially covering our move to Chef, then a mid-transition summary of our thoughts and experiences during the process, and concluding with the results.

—-

Building a high-performance, highly redundant infrastructure in a very short period of time can be a difficult task in a startup with a relatively small number of IT personnel. At InsideVault, we attempted to push the envelope, deploying both our high-performance production systems and office support applications in the cloud as quickly as possible.

The cloud excels at providing a low cost entry point, with no significant initial infrastructure investment, an “as you need it” pricing model, and the ability to change up the type of resources in use very quickly with no additional costs. And relatively reliable data storage is often part of the package. The management tools have been rather minimal in the past but this has been changing as companies like Amazon increasingly deploy additional tools that either ease administration or provide new services.

While Amazon has been adding tools such as CloudFormation to allow for increased automation and ease of deployment, there is still a fair amount of IT time and resources required to get a production ready environment off the ground. Automation, backups, redundancy, and other niceties such as version tracking all require additional time to implement, longer if the IT group responsible for the environment is in the process of coming up to speed on best practices for a cloud environment.

One of the tools that made it possible to ramp up a production ready system in a very short period of time was RightScale, which is an online cloud management platform. RightScale provides a number of templates that make it very easy to piece together relatively generic stacks and deployments (such as LAMP). With RightScale you can grab a few templates, plug in the inputs, hit a button, and you have a fully redundant load-balanced web server environment with regular backups and the ability to easily create auto-deploying arrays of additional web servers in case of high load. The templates are not always completely ideal, but they provide the option to completely customize or even build them from scratch, and the built-in versioning system works well.

There are some downsides, however.

RightScale templates require that a number of dependencies be met within the environment to allow their tools access to the server environment. This can limit the options a company has when attempting to implement tight security over ports or user controls in production. Database and application deployments can also have user configuration or folder structures that are less than ideal and difficult to change due to the nature of the RightScale template scripts and dependencies.

Cost can be another issue. Startup companies deploying open source products to the cloud are looking to keep initial costs very low. RightScale comes at a cost that could be considered rather steep for a small company that is still ramping up, especially relative to the cost of the applications being deployed if the environment is primarily open source. For a small to medium sized environment, the cost can be justified fairly easily by the reduction in IT resource requirements, but the problem occurs when you have a larger environment and a small lean running company.

At InsideVault, we have sharded NoSQL databases in both production and staging environments that are rapidly reaching the multi-terabyte mark, a large number of standard and proprietary application servers, and redundancy across the board, each requiring a fairly high server count both in production and in staging. In addition, we have a fully functional deployment that simulates the kind of environment our customers might use, allowing us to get a feel for how well the environment works for real world day to day use. The amount of data processing we perform requires that a significant portion of our environment be deployed to some of the more powerful instance sizes that Amazon provides in their EC2 environment. All of these result in a fairly high server and CPU count for a small company.

RightScale has several pricing tiers. The lowest one could be considered quite affordable considering the services offered, but RightScale begins charging a fairly significant premium for CPU counts above and beyond the base allowance (higher-level plans support an increased number of EC2 Compute Units). The overage model resembles cell phone overage charges to some degree, and in our case rapidly quadrupled the associated RightScale costs. This left us with two options: switch to a more expensive RightScale plan that would allow for increased Compute Units, or find an alternate deployment solution.

As we now had a stable functional environment, additional IT resources were available to deploy an open source solution, so we decided to move to an alternate method of deploying our servers. The two most attractive options were Chef (Opscode) and Puppet (Puppet Labs). After a significant amount of research into the tradeoffs and benefits of each of these, the decision was made to go with Chef. In addition to having some features we found ideal, Chef was also tightly integrated into RightScale, resulting in some experience with Chef deployments already existing in-house.

Thus begins our journey.

Chef offers a significant number of benefits and features for deploying to a cloud environment. But certain forms of automation and failure handling can be difficult to implement, especially during the initial deployment. To further add to our toolbox, we shall be making use of a relatively new service provided by Amazon. CloudFormation provides a number of tools that can work together with Chef to assist with automation and deployment.

Migrating an environment from a RightScale managed deployment to one managed by Chef and CloudFormation utilizing cloud best practices is not a process that appears to have been covered a great deal online. With this in mind, we felt it would potentially be informative to blog about our experience with this transition, starting with the planning and initial test phase, moving through the (bumpy) road of the transition, and completing with a section covering the results, how it meets or misses our expectations, potential pitfalls, and general thoughts on the process.

Should there be interest, we may also publish some additional blogs covering the Chef/CloudFormation integration, details of Chef/Amazon EC2 deployment, and Sharded Mongo failure handling. And while research was done on the Chef vs. Puppet debate, we will probably not be publishing the results of this research as many sites already provide various thoughts on this topic (though occasionally rather biased), and we do not have enough hands-on experience with Puppet to provide valuable input.

Please stay tuned for part two…

Maverick

Link button in Ext JS

Here at InsideVault we use the Ext JS JavaScript framework for our front-end. One of the great things about Ext JS is how easy it is to extend it to create custom components.

For our application, there are numerous occasions where we need a hyperlink to act like a button. That is, rather than navigating to a new page, when clicked, the link must perform some other action – opening a new Ext JS Window component, for example.
One way to do this is to use an Ext JS container component.

For example, we can add this to a panel.


{
  xtype:'container',
  html:'<a href="/ext/keywords">Keywords</a>'
}

Then in a controller we can listen for the ‘render’ event on that panel, and set up a listener for the click event on the link. Like this -

 
panel.getEl().on('click', function (e, t) {
  if (t.tagName === 'A') {
    e.preventDefault(); // prevent navigation to the link location

    // ~indexOf(...) is truthy whenever the substring is found (indexOf !== -1)
    if (~t.pathname.indexOf('/ext/keywords')) {
      // perform some action
    } else if (~t.pathname.indexOf('/ext/settings')) {
      // perform some action
    } else if (~t.pathname.indexOf('/ext/ads')) {
      // perform some action
    }
  }
});

As you can see, for each link, we need to check the path and then perform the appropriate action. If there are many such links, this can become a hassle as these code fragments will clutter the codebase.

An elegant solution to this problem is to extend the Ext JS container, and define a custom component. Like this –


/**
 * Component for creating links that act like buttons.
 * Usage:
 *
 * ui component
 *  {
 *     xtype: 'linkbutton',
 *     text: 'Click Me',
 *     itemId: 'clickLink'
 *  }
 *
 *
 * listener
 *  #clickLink  or (linkbutton[itemId=clickLink]): {
 *      linkclick: function()  {
 *          // do something
 *      }
 *  }
 */

Ext.define('LinkButton', {
  extend: 'Ext.container.Container',
  alias: 'widget.linkbutton',
  config: {
    text: ''
  },

  cls: 'link-button',  // use this to control how the link appears in the UI

  initComponent: function () {
    var me = this;
    this.html = '<a href="#">' + this.getText() + '</a>';

    this.renderData = {
      text: this.getText()
    };
    me.callParent(arguments);

  },
  listeners: {
    render: function(cmp) {
      cmp.getEl().child('a').on('click', function(e){
        e.preventDefault();
        cmp.fireEvent('linkclick', cmp);
      }, cmp);
    }
  }
});

Now we can create link buttons just like ordinary buttons!
All we need to do is to set up a link like this –


{
  xtype: 'linkbutton',
  text: 'Click Me',
  itemId: 'clickLink'
}

And then listen for the ‘linkclick’ event in a controller.


'panel linkbutton[itemId=clickLink]': {
  linkclick: function () {
    // do something
  }
}

Walking on the wild side with JavaScript

Here at InsideVault, we are building the client side application using Ext JS 4. It is a great framework and it even manages to bring some order to the chaos of JavaScript.

JavaScript has been around for quite a few years now, but for much of its existence it has been used mostly as a minor appendage to HTML and CSS. The variation of JavaScript between browsers was a major block for the growth of the language. This was somewhat overcome by the arrival of libraries such as the infamous jQuery.

JavaScript is an extremely powerful dynamic language. However, with power comes responsibility and therein lies the problem. The language supports object-oriented, functional and imperative programming styles, but only in a very loose fashion.

For example, encapsulation is a fundamental part of the object-oriented programming style. In a language such as Java, encapsulation is achieved by using scope modifiers such as private and public.


public class Hello {
  public void printHelloPeter() {
    printHello("Peter");
  }

  public void printHelloPaul() {
    printHello("Paul");
  }

  private void printHello(String name) {
    System.out.println("Hello " + name);
  }
}

For JavaScript, encapsulation is a far more convoluted process and in practice such detail is often left out. The most effective way to enforce scope privacy in JavaScript is to use a closure.


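var hello = (function() {
  // private: not reachable from outside the closure
  var printHello = function(name) {
    console.info("Hello " + name);
  };

  return {
    // wrapped in functions so they run when called, not when the object is built
    printHelloPeter: function() { printHello("Peter"); },
    printHelloPaul: function() { printHello("Paul"); }
  };
})();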

One of the potential drawbacks with this approach is that it can eat up memory. So, care has to be taken to avoid it in certain situations and bite the bullet of poorly encapsulated code.
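
The memory cost comes from the fact that every object built through a closure carries its own copies of its function objects, whereas methods placed on a prototype are shared by all instances. A minimal sketch of the difference, using the hypothetical names makeHello and Hello (the functions return strings instead of logging so the result is easy to inspect):

```javascript
// Closure style: each call creates fresh function objects
function makeHello() {
  var printHello = function (name) { return 'Hello ' + name; }; // private helper
  return {
    helloPeter: function () { return printHello('Peter'); }
  };
}

// Prototype style: one function object shared by every instance, but nothing is private
function Hello() {}
Hello.prototype.helloPeter = function () { return 'Hello Peter'; };

var a = makeHello();
var b = makeHello();
var closuresShared = (a.helloPeter === b.helloPeter);   // false: two separate copies

var c = new Hello();
var d = new Hello();
var prototypeShared = (c.helloPeter === d.helloPeter);  // true: one shared function
```

For a handful of objects the duplication is irrelevant; it starts to matter when thousands of instances are created, which is the situation where giving up privacy can be the pragmatic choice.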

JavaScript just does not make it easy to do the right thing. In fact, all aspects of the object-oriented programming style are a strain to achieve in JavaScript. Several JavaScript libraries and frameworks try to address this shortfall and Ext JS 4 is one of those that does a great job of this.

For example, Ext JS 4 allows us to create a Hello class fairly easily.


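Ext.define('Example.Hello', {
  constructor: function() {
    // private: only visible inside the constructor's closure
    var printHello = function(name) {
      console.info("Hello " + name);
    };

    Ext.apply(this, {
      // wrapped in functions so they run when called, not during construction
      printHelloPeter: function() { printHello("Peter"); },
      printHelloPaul: function() { printHello("Paul"); }
    });

    this.callParent(arguments);
  }
});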

However, you do have to follow the designated pattern. JavaScript does not force you to create classes in any particular way. So, why use such a carefree language in a professional software development setting? Well, the fact is, when it comes to in-browser software development, it’s the only show in town, unless you want to force the user to download a plug-in. In any case, with some discipline, the results can be pretty impressive.

A similar technique can be applied when extending Ext JS 4 view components:


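Ext.define('Example.HelloView', {
  extend: 'Ext.Panel',
  alias: 'widget.helloview',

  initComponent: function() {
    // private: only visible inside initComponent's closure
    var printHello = function(name) {
      console.info("Hello " + name);
    };

    Ext.apply(this, {
      // wrapped in functions so they run when called, not during initialization
      printHelloPeter: function() { printHello("Peter"); },
      printHelloPaul: function() { printHello("Paul"); }
    });

    this.callParent(arguments);
  }
});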

So, why bother with encapsulation? In a perfect world, all methods would be public and the other programmers would respect comments indicating which methods should be accessed. In practice, encapsulation protects the integrity of internal data, allows the code to be refactored more easily as the system evolves, reduces the amount of error-checking code, promotes loosely coupled modules, increases opportunities for reuse, and reduces the chances of namespace collisions and of polluting the public namespace.
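
To make the first of those benefits concrete, here is a minimal sketch of closure-based privacy protecting internal data. The makeCounter name and its methods are hypothetical, chosen only for illustration.

```javascript
// Hypothetical example: "count" lives only inside the closure.
function makeCounter() {
  var count = 0; // private state, unreachable from outside

  return {
    increment: function () {
      count += 1;
      return count;
    },
    value: function () {
      return count;
    }
  };
}

var counter = makeCounter();
counter.increment();

// This does not touch the private variable; it only adds an unrelated public property.
counter.count = 100;

var current = counter.value(); // the private count is still 1
```

Because outside code cannot reach count, the counter never needs to validate it, which is the kind of error-checking code that encapsulation lets you leave out.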

Copyright © InsideVault Inc. All rights reserved.