Saturday 18 August 2012

Functional Programming in PowerShell


Functional Programming in PowerShell

A while back I got introduced to the concept of functional programming and like most of you out there I found it hard to understand some of the concepts. Over time I have slowly started to understand some of the concepts and thought I would try to once for all capture them as I have a memory of a goldfish. Please note that it is important to try to understand these concepts if you are going to understand this blog post.

If you would like to learn more about it I suggest this great book titled Real-World Functional Programming.

Functional Programming is defined as:

A programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasises the application of functions, in contrast to the imperative programming style, which emphasises changes in state.

Functional Programming is defined as using a style called Declarative programming rather than your traditional Imperative programming. What do these mean styles mean?

Declarative Programming is defined as:

A programming paradigm that expresses the logic of a computation without describing its control flow. Many languages applying this style attempt to minimise or eliminate side effects by describing what the program should accomplish, rather than describing how to go about accomplishing it.

Imperative Programming is defined as:

In computer science, imperative programming is a programming paradigm that describes computation in terms of statements that change a program state. In much the same way that imperative mood in natural languages expresses commands to take action, imperative programs define sequences of commands for the computer to perform.

The following points are what some of the functional programming languages exhibits:
  1. Immutable Object
  2. Higher Order Functions
  3. Currying
  4. Lazy Evaluation
  5. Continuations
  6. Pattern Matching
  7. Closures
Luckily for us is that

PowerShell is motivated by functional programming

Right so lets go through these points and how they apply to PowerShell.

Immutable Object

An immutable object is an object whose state cannot be modified after it is created. This is accomplished as follows:
Describe "Immutable Object" {
    It "Should create immutable object" {
        $immutable = New-ImmutableObject @{ Name = "test"}
        $immutable.Name.Should.Be("test")

        try {
           $immutable.Name = "test1"
           $false.Should.Be($true)
        }
        catch [System.Management.Automation.SetValueException] {
            $true.Should.Be($true)
        }
    }
}

function New-ImmutableObject($object) {
    $immutable = New-Object PSObject

    $object.Keys | %{ 
        $value = $object[$_]
        $closure = { $value }.GetNewClosure()
        $immutable | Add-Member -name $_ -memberType ScriptProperty -value $closure
    }
    
    return $immutable
}
By adding a property that is of type ScriptProperty it generates a getter. When we try to modify it it will throw a System.Management.Automation.SetValueException. The example uses a Hashtable as the input, however it is quite easy to extend it to use any type of object.

Higher Order Functions

A higher-order function is a function that does at least one of the following:
  • Take one or more functions as an input
  • Output a function

So lets look at a simple test that will work out even and odd values in an array:
Describe "Higher Order Functions" {
    $values = @(1, 2, 3, 4)

    It "Should filter even" {
        $evenPredicate = { param($value) return $value % 2 -eq 0 }
        $even = Convert-ByFilter $values $evenPredicate
        $even.Length.Should.Be(2)
        $even[0].Should.Be(2)
        $even[1].Should.Be(4)
    }

    It "Should filter odd" {
        $oddPredicate = { param($value) return $value % 2 -eq 1 }
        $odd = Convert-ByFilter $values $oddPredicate
        $odd.Length.Should.Be(2)
        $odd[0].Should.Be(1)
        $odd[1].Should.Be(3)
    }
}

function Convert-ByFilter($values, $predicate) {
    return $values | where { & $predicate $_ }
}
PowerShell has a concept of a script block. The important thing to notice is that we are using a where expression and we are invoking the predicate with an & (this took me a while to work out).

Currying

Currying is the technique of transforming a function that takes multiple arguments (or an n-tuple of arguments) in such a way that it can be called as a chain of functions each with a single argument.

This has proven to be a bit difficult in PowerShell however it is not impossible. Let's look at the example below:
Describe "Currying" {
    It "Should add two values via annonymous functions" {
        $add = { param($x) return { param($y) return $x + $y }.GetNewClosure() }
        $addFive = & $add 5
        $ten = & $addFive 5
        $ten.Should.Be(10)

        $ten = & (& $add 5) 5
        $ten.Should.Be(10)
    }

    It "Should add two values via named functions" {
        function add($x) { return { param($y) return $y + $x }.GetNewClosure() }
        $addFive = add 5

        $ten = & $addFive 5
        $ten.Should.Be(10)

        $ten = & (add 5) 5
        $ten.Should.Be(10)
    }
}
As we can see, a curried function can be created as an anonymous function or a named one. I unfortunately could not find a clean way to represent this as a generic solution. The trick here is to always return a new closure. There is a JavaScript implementation that might be able to be used in PowerShell that maybe someone could help me with.

Lazy Evaluation

Lazy Evaluation is an evaluation strategy which delays the evaluation of an expression until its value is needed and which also avoids repeated evaluations.

This fortunately is easy to create as the wonderful people in the .NET framework created the class System.Lazy. Lets have a look at some PowerShell code:
Describe "Lazy" {
    $lazy = New-Lazy { return "test" }

    It "Should not have a value evaluated" {
        $lazy.IsValueCreated.Should.Be($false)
    }

    It "Should get lazy value" {
        $lazy.Value.Should.Be("test")
        $lazy.IsValueCreated.Should.Be($true)
    }
}

function New-Lazy($script) {
    $function = [System.Func[object]] $script
    $lazy = New-Object System.Lazy[object] $function

    return $lazy
}
All we need to do is create a Lazy object and call the Value property. The implementation will cache the value.

Continuations

A continuation is an abstract representation of the control state of a computer program. What does that mean?

This to me was complicated to understand why would you want such complication. I found a great explanation in the following article of how you can use continuations in a real world sense (nothing worst than thinking in abstract terms)

PowerShell does not have a feature like this (neither does .NET for that matter). The closest thing that PowerShell has is what is called Workflows. To look at how these are implemented I will need another blog post.

The other thing that we can look into is Continuation-passing style, which is described as:

A function written in continuation-passing style takes as an extra argument: an explicit "continuation" i.e. a function of one argument. When the CPS function has computed its result value, it "returns" it by calling the continuation function with this value as the argument.

This can be achieved with Task Parallel Library (TPL) or Async/Await, which neither work in PowerShell.

Pattern Matching

Pattern matching is the act of checking some sequence of tokens for the presence of the constituents of some pattern. Pattern matching in PowerShell is accomplished using the switch statement.

Using your traditional imperative programming languages the switch statement is considered a code smell. The reason that they have been frowned upon it is simply because they usually display signs of code duplication.

Pattern matching is encouraged in functional programming languages because they are much more powerful and functions are first class citizens.

Luckily for us PowerShell has a strong pattern matching construct in the switch statement. Lets look at some examples:
Describe "Pattern Matching" {
    It "Should use a simple switch" {
        $a = 5
        $result = switch ($a) { 
            1 {"The colour is red."} 
            2 {"The colour is blue."} 
            3 {"The colour is green."} 
            4 {"The colour is yellow."} 
            5 {"The colour is orange."} 
            6 {"The colour is purple."} 
            7 {"The colour is pink."}
            8 {"The colour is brown."} 
            default {"The colour could not be determined."}
        }

        $result.Should.Be("The colour is orange.") 
    }

    It "Should use a wildcard switch" {
        $a = "d14151"

        $result = switch -wildcard ($a) { 
            "a*" {"The colour is red."} 
            "b*" {"The colour is blue."} 
            "c*" {"The colour is green."} 
            "d*" {"The colour is yellow."} 
            "e*" {"The colour is orange."} 
            "f*" {"The colour is purple."} 
            "g*" {"The colour is pink."}
            "h*" {"The colour is brown."} 
            default {"The colour could not be determined."}
        }

        $result.Should.Be("The colour is yellow.") 
    }

    
It "Should use a regex switch" {
        $a = "r14151"

        $result = switch -regex ($a) { 
            "[a-d]" {"The colour is red."} 
            "[e-g]" {"The colour is blue."} 
            "[h-k]" {"The colour is green."} 
            "[l-o]" {"The colour is yellow."} 
            "[p-s]" {"The colour is orange."} 
            "[t-v]" {"The colour is purple."} 
            "[w-y]" {"The colour is pink."}
            "[z]" {"The colour is brown."} 
            default {"The colour could not be determined."}
        }

        $result.Should.Be("The colour is orange.") 
    }
}
As we can see this is quite powerful as the switch statement returns an expression.

Closures

A closure (also lexical closure or function closure) is a function together with a referencing environment for the non-local variables of that function.

Closures in PowerShell are created with the function GetNewClosure.

The new script block is closed over the local variables in the scope that the closure is defined in. In other words, the current values of the local variables are captured and enclosed inside the script block bound to the module.

You have seen this in the previous examples.

Hopefully this has been an interesting journey through some of the functional programming concepts and how they can be applied to PowerShell. I will try to build a module that makes it easier to use some of theses constructs.



1 comment: