Why computer science matters for software developers


I used to think computer science never mattered because I rarely used algorithms and never saw the value of algorithm-based interviews (I still don’t ;) ). The few folks I asked also concurred so I felt I was right.

September 2016

My team got reorged and our goal was to build and deliver a brand new SaaS offering. As part of that move, I made a conscious decision to switch to full stack engineering + dev ops. My experiences have made me re-evaluate my initial stance and realize I was wrong.

Computer Science does matter, a lot! Having that foundation empowers you to

  • Make better tradeoff decisions
  • Innovate new ways of solving problems
  • Spot design pitfalls early; e.g. a design that violates CAP theorem is a disaster waiting to happen.
  • Avoid solving impossible problems e.g. trying to parse HTML with regex.

The following paragraphs show how computer science principles underpin common concepts.

1. File systems and the HTML DOM

What do hierarchical file systems have in common with the HTML DOM? Simple, both are based on trees. Some common operations on trees include reparenting, search and walks. The table below shows how file system and DOM operations can be tied back to basic tree operations.

Tree Operation File system DOM
Search File search Search
Traversal Directory listing Layout rendering
Node reparenting File move Hiding and showing sections of the DOM

Having this foundation allows you to ask deeper questions. For example, let’s analyze the reason behind the discouragement of excessive DOM manipulations.

It’s not the DOM – tree operations are fast. The challenge lies in repainting the entire HTML layout when deep-lying nodes change. If you keep moving nodes around, the browser has to play catch up and this adds up over time.

The ‘best practice’ is to detach the node, manipulate it and then re-attach it. This approach means the browser repaints twice – on attach and detach. The detached node can change several times without triggering DOM reflow since it’s no longer attached.

2. Solving scheduling problems

I wrote a timetable generator in PHP as an intern 8 years ago. Yes, it was a brute force solver and took about an hour to generate a timetable for an average-sized school.

A quick inefficient solution can get you to the market fast and might even work well at small scale. However, any attempt to extend or improve such solutions would be prohibitive. My brute force solution of 2009 would have broken for larger problem sets; a trivial ask such as introducing a new constraint e.g. multiple teachers for multiple classes, would have necessitated a rewrite.

The timetable problem is a constraint satisfaction problem (CSP). Other popular CSP problems include appointment scheduling, map colouring and the 8 queens puzzle. Backtracking search is a standard way to solve CSPs. Solvers can also leverage greedy search, minimum-conflicts heuristics and simulated annealing.

This approach separates the problem from the solver; thus it becomes easy to change the constraints and extend the solver to new scenarios.

3. Avoiding stack overflows

How would you make sure your recursive function never runs out of stack space? This might not be a problem if the language optimizes tail calls but alas most languages don’t.

  1. You could catch the stack overflow exception but that doesn’t mean your computer can’t calculate the value. It just means the recursive implementation exceeded the computer’s stack memory limit.
  2. You could convert the recursive function to a loop. This would require passing in values around and might not be so elegant.

A trampoline solves the problem beautifully and can be reused for all recursive functions and across languages. Read this post to learn how a trampolined factorial function allows you to compute the factorial of 30000.

4. Consistently parsing HTML with Regex?

A common joke is that experienced programmers send newbies on a wild goose chase by asking for a regex parser for HTML. It’s been long since I did any automata course but the short answer goes thus:

  • Regular expressions are a regular grammar (Type 3)
  • HTML is a context-free grammar (Type 2)

Type-2 grammars encompass Type-3 and are thus more complex. See the Chomsky hierarchy.

It might be safe to do this for a small set or to extract data out of pages. But saying this is something that’ll work all the time is not true.

Tip: Know the class of impossible computing problems and save yourself from wasting time on futile challenges.

Conclusion

Concepts like design patterns, computer networks, architecture, etc. all matter in the software engineering profession. However, having a solid CS background is key.

There are loads of opportunities to apply computer science principles to daily tasks – you just have to make the push.

Thoughts?

Related

Advertisements

What every programmer should know about types I


What is a type?

Type: (noun) a category of people or things having common characteristics.

A type represents the range of values of a particular type. Let’s take an example from the mathematical concept of sets in number theory. The set of integers can be seen as a type – only values such as 1, 2, 3 are integers; decimals (e.g. 1.1) or irrational numbers (e.g. π) aren’t members of the integer set. Extrapolating to common programming types, we can create more examples:

  • Integers can only be 1,2,3…
  • Boolean types can only be true or false
  • Floating point / real numbers can represent decimals (with some loss of accuracy for irrational numbers)
  • Strings are strings.

Buckets

Think about a bucket – it can hold various things (water, sand or even fruits). What you can do with a bucket depends on two things:

  • What the bucket contains
  • The rules determining what the bucket can hold.

Lets look at two bucket usage philosophies.

Strict bucket land

The standing rule in strict bucket-land is to have specialized buckets for distinct content types. If you try to use the same bucket for apples and oranges, you’ll get a good talking to.

One advantage is that you immediately know what you are getting once you read a bucket’s label; however, if you need to fetch a single apple and a single orange, you’ll need two buckets; I know this sounds ridiculous but rules are rules.

Loose bucket land

Everything is relaxed in loose bucket-land. Everything – apples, oranges, some sand, ice cream, water etc.) – goes into the same bucket. Unlike the strict folks, no one is going to fret as long as you don’t disrupt their daily lives.

If you need an orange, you dip into the bucket, pull ‘something’ out and then check if you really got an orange. The value is tied to what you actually pull out of the bucket and not the bucket’s label.

Some loose bucketers try to imitate the strict bucketers by using explicitly-labeled buckets . Because there is no hard rule preventing mixes; a trickster can drop an apple into your orange bucket!

Static vs Dynamic Types

The metaphors in the scenario explained above follows:

  • The bucket represents variables (they are containers after all)
  • The bucket contents (e.g. apples, oranges, sand etc.) are the values
  • The rules are the programming language rules

The big schism between dynamically typed and statically typed languages revolves around how they view variables and values.

In static languages; a variable has a type and this restricts the values you can put into it and the operations you can do with it. To draw on the bucket analogy – you typically won’t (and can’t) pour sand into the fruits bucket.

Dynamic language variables have no type – they are like the loose buckets described earlier. Because the containers are ‘loose’, valid actions for a variable depend on its content. To give an analogy – you wouldn’t want to eat out of a sand bucket.

  1. Static systems – variables have a type. So a container may only hold ints, floats or doubles etc.
  2. Dynamic type systems – variables can hold anything. However the values (contents of the variables) have a type.

That is why you can’t assign a string to an int in C#/Java because the variable container is of type int and only allows ints. However, in JavaScript, you can put anything in a variable. The typeof operator checks the type of the value in the variable and not the variable itself.

Why does this matter?

The adherents of the static typing methodology always argue that if it ‘compiles’ then it should work. Dynamically typed language adherents would quickly poke holes in this and tout good testing because that guarantees code works as expected.

In a very strictly-typed language, it would be theoretically impossible to write code that would have runtime bugs! Why? Because typically bugs are invalid states and the type checks would make it impossible to represent such states programmatically. Surprised? Check out Haskell, F#, Idris (yes it is a programming language) or Agda.

The strictness vs testing spectrum

How much testing is required given a language’s type system?

I would view this as a spectrum – to the left would be the extremely loose languages (JavaScript) whereas the right end would contain the strictest languages (Idris). Languages like Java and C# would fall around the middle of this spectrum – they are not strict enough as is evident by loads of runtime bugs.

As you move from the left to the right, then the amount of testing you need to validate the correctness of your program should reduce. Why? The type system will provide the checks for you on your behalf.

Conclusion

I hope this post clarifies some thoughts about type systems and testing for you. Here are some other posts that are related:

  1. Programming Language Type Systems I
  2. Programming Language Type Systems II

A simple explanation of 1s complement arithmetic


I remember taking the digital systems course in my second year of university and being exposed to concepts like k-maps, 1s and 2s complement arithmetic. These building blocks served as the foundations for bigger concepts like adders, half-adders, comparators, gates and what not.

It was pretty much fun doing the calculations then even though I didn’t fully realize why I needed to add a 1 while doing 1s complement arithmetic. Until I stumbled upon the excellent explanation in Charles Petzold’s Code; a great book that uses very lucid metaphors for explaining computing concepts. As is usually the case – the best explanations are typically so simple and straightforward that anyone can grasp them.

Even if you already know about 1s and 2s complement arithmetic; still go ahead and read this, you might find something interesting.

Subtraction – the basics

Assuming you need to find the difference between two numbers, e.g. 173 and 41; this is pretty straightforward, you do so

minuend 174
subtrahend 041
difference 133

Aside: Minuend and subtrahend are valid names, the names of parameters for mathematical operations are given below.

Expression

First number (i.e. 5)

Second number (i.e. 3)

5 + 3

Augend

Addend

5 – 3

Subtrahend

Minuend

5 * 3

Multiplier

Multiplicand

5 / 3

Dividend

Divisor

This is the simple case, how about the carry scenario when you need to ‘borrow’ from preceding digits?

minuend 135
subtrahend 049
difference 086

Aha the pesky borrowing! What if there was a way to avoid borrowing? The first thing to think of is the ceiling of all 3-digit numbers i.e. the smallest possible number that would require no borrows for any 3-digit number. We use 3-digits since we are taking a 3-digit example; were the subtrahend to be a 5-digit number, we would need the smallest 5-digit number value too.

That smallest ‘no-borrows-required’ number is 999. Unsurprisingly, it is the maximum possible value in base ten if you have only 3 digits to use in the hundreds, tens and units positions. Note: In other bases, the values would also be the penultimate value e.g. for base 8, it’ll be 777.

Now, let’s use this maximum value as the minuend

minuend 999
subtrahend 049
difference 950

Since we are using 999 as the reference value, then 49 and 950 are complements of each other; i.e. both add up to give 999. So we can say 49 is the 9s complement of 950 just as 950 is the 9s complement of 49.

Awesome! We can now avoid the annoying carries!! But knowing this is useless in itself unless we can find some way to leverage this new-found knowledge. Are there math tricks to use?  Turns out this is very possible and straightforward.

Math-fu?

Lets do some more maths tricks (remember all those crafty calculus dy/dx tricks)…

135 – 49 = 135 – 49 + 1000 – 1000

= 135 + 1000 – 49 – 1000

= 135 + 1 + 999 – 49 – 1000

= 135 + 1 + (999 – 49) – 1000

= 136 + 950 – 1000

= 1086 – 1000

= 86

QED!

What’s the use of such a long process?

We just found a very long way to avoid carries while doing subtraction. However, there is no motivation to use this since it is quite inefficient in base 10. So what’s the reason for all this?

It turns out that in computer-land, counting is done in 0s and 1s. The folks there can’t even imagine there are numbers other than 1 and 0. As you’ll see, there are some great advantages to using the 1s complement approach in this area.

Lets take a simple example e.g. 11 – 7

minuend 1011
subtrahend 0111
difference ????

Applying the same trick again (this time the minuend will be 1111 instead of 9999).

minuend 1111
subtrahend 0111
difference 1000

Do you notice a pattern between the subtrahend (0111) and the difference (1000)? The complements seem to be ‘inverses’ of each other.

The 1s complement of any numeric binary value is just the bitwise inverse of the bits in the original value. Calculation is just a matter of flipping each bit’s value, a linear  O(n) operation that can be quite fast. That’s a BIG WIN.

Continuing the process again with the addition step this time:

Augend (difference from step above) 01000
Addend (of 1) 00001
Addend (of original 11 value) 01011
Sum 10100

Finally the subtraction of the next 1* number that is larger which will be 10000 (since we used 1111).

subtrahend 10100
minuend 10000
difference 00100

And there is the 4! answer done!

How about negative numbers? Simple, just do the same process and invert the answers.

Hope you found this fascinating and enjoyed it. Let me know your thoughts in the comments.

Understanding and using Streams in JavaScript


Introduction

What do you think of the following code snippet?

NaturalNumbers()
.filter(function (n) { return n % 2 === 0; })
.pick(100)
.sum();

Isn’t it beautifully succinct and neat? It reads just like English! That’s the power of streams.

Streams are just like lists but offer more capabilities because they simultaneously abstract data and computation.

Streams vs Lists/Arrays?

Let’s take a scenario from Mathematics, how would you model the infinite set of natural numbers? A list? An Array? Or a Stream?

Even with infinite storage and time, lists and arrays do not work well enough for this scenario. Why? Assuming the largest possible integer an array can hold is x, then you’ve obviously missed out on x + 1. Lists, although not constrained by initialization, need to have every value defined before insertion.

Don’t get me wrong, lists and arrays are valid for a whole slew of scenarios. However, in this situation, their abstraction model comes up short. And when abstractions do not perfectly match problem models, flaws emerge.

Once again, the constraints of this problem:

  • The size of the problem set might be infinite and is not defined at initialization time  (eliminates arrays).
  • Elements of the set might not be defined at insertion time (eliminates lists).

Streams, which combine data and computation, provide a better abstraction for such infinite problem sets. Their ability to model infinite lists stems from lazy evaluation – values are only evaluated when they are needed. This can lead to significant memory and performance boosts.

The set of natural numbers starts from 1 and every subsequent number adds 1 to its predecessor (sounds recursive eh? ). So a stream that stores the current value and keeps adding one to it can model this set.

Note: As might have become obvious: extra data structures might be needed to store previously generated stream values. Streams typically only hold a current value and a generator for calculating the next value.

What is a Stream?

I published stream-js, a very small (4.1kb minified) library that provides stream processing capabilities. Grab it or read the source as the post builds on it.

Oh, do contribute to the repo too!

How do I create a stream?

The Stream constructor expects an initial value and a generator function, these two values form the stream head and tail respectively.

An empty stream has null head and tail values. In infinite streams, the tail generator will endlessly generate successive values.

var emptyStream = new Stream(null, null);

var streamOf1 = new Stream(1, null);

var streamOf2 = new Stream(1, function () {
    return new Stream(2, null);
});

var streamOf3 = Stream.create(1,2,3);

var streamFromArray = Stream.fromArray([1,2,3]);

Note: The fromArray method uses the apply pattern to partially apply the input array to the arguments function above.

Show me the code!

Now that you know how to create Streams, how about a very basic example showing operations on Streams vs Arrays in JS?

With Arrays

var arr = [1,2,3];
var sum = arr.reduce(function(a, b) {
    return a + b;
});
console.log(sum);
//6

With Streams

var s = Stream.create(1,2,3);
var sum = s.reduce(function(a, b) {
    return a + b;
});
console.log(sum);
//6

The power of streams

The power of streams lies in their ability to hold model infinite sequences with well-defined repetition patterns.

The tail generator will always return a new stream with a head value set to the next value in the sequence and a tail generator that calculates the next value in the progression.

Finite Streams

The Stream.create offers an easy way to create streams but what if this was to be done manually? It’ll look like this:

var streamOf3 = new Stream (1, function() {
    return new Stream(2, function() {
        return new Stream(3, function () {
            return new Stream(null, null);
        });
    });
});

Infinite Streams

Infinite Ones

Let’s take a dummy scenario again – generating an infinite series of ones (can be 2s too or even 2352s). How can Streams help? First the head should definitely be 1, so we have:

var ones = new Stream(1, ...);

Next, what should tail do? Since it’s a never-ending sequence of ones, we know that tail should generate functions that look like the one below:

var ones = new Stream(1, function() {
    return new Stream (1, function() {
        ...
    };
});

Have you noticed that the inner Stream definition looks like the Ones function itself? How about having Ones use itself as the tail generator? Afterall head would always be one and tail would also continue the scheme.

var Ones = function () {
    return new Stream(1, /* HEAD */
        Ones /* REST GENERATOR */);
};

Natural Numbers

Let’s take this one step further. If we can generate infinite ones, can’t we generate the set of Natural numbers too? The recurring pattern for natural numbers is that elements are larger than their preceding siblings by just 1.

Let’s define the problem constraints and add checkboxes whenever a stream can be used.

  • Set is infinite ☑
  • Set has a well-defined recurring pattern ☑
  • Definition needs an infinite set of ones ☑

 

So can streams be used to represent natural numbers? Yes, stream capabilities match the problem requirements. How do we go about it?

The set of natural numbers can be described as the union of the set {1} and the set of all numbers obtained by adding ones to elements of the set of natural numbers. Yeah, that sounds absurd but let’s walk through it.

Starting from {1}, 1 + 1 = 2 and {1} ∪ {2} = {1,2}. Now, repeating the recursion gives rise to {1, 2} ∪ {2, 3}  = {1,2,3}. Can you see that this repeats indefinitely? Converting to code:

function NaturalNumbers() {
    return new Stream(1, function () {
        return Stream.add(
            Stream.NaturalNumbers(),
            Stream.Ones()
        );
    });
};

Execution walkthrough

The first call to NaturalNumbers.head() returns 1. The tail function is given below:

function () {
    return Stream.add(
        Stream.NaturalNumbers(),
        Stream.Ones()
    );
}
  • Stream.NaturalNumbers is now a stream that has a head of 1 and a tail generator that points to itself. Think of the sets {1} and Natural numbers.
  • Stream.Ones is a stream with a head of one and a tail generator of ones.

Once invoked, this will give a new stream with a head of 1 + 1 and a new tail function that will generate the next number.

Building upon natural numbers

Generating the sets of even and odd numbers is a cinch – just filter the set of natural numbers!

var evenNaturals = NaturalNumbers().filter(function(val) {
    return val % 2 === 0;
});

var oddNaturals = NaturalNumbers().filter(function(val) {
    return val % 2 === 1;
});

Pretty simple right?

Who needs infinite sets?

Computers are constrained by storage and time limits so it’s not possible to ‘have’ infinite lists in memory. Typically only sections are needed at any time.

stream-js allows you to do that

  • Stream.pick: allows you to pick elements of a stream.
  • toArray: converts a stream to an array

A typical workflow with stream-js would involve converting an input array to a stream, processing and then converting back to an array.

For example, here is the array of the first 100 odd numbers; you need a 1000? Just pick them (pun intended).

var first100odds = oddNaturals.pick(100).toArray();

Note: Stream operations can be chained since most stream operations return new streams (i.e. are closed operations). Here is odo, v0.5.0 of stream-js.  Odo means river in Yoruba, the language of my tribe.

And that’s about it! I hope you enjoyed this, now read how to write a promise/A+ compatible library next.

Why JavaScript ‘seems’ to get addition wrong


Introduction

JavaScript is a dynamic weakly-typed language so it’s possible to have expressions like this:

var foo = "string" + 22 * 3 - 4;

This post explains how JavaScript evaluates such complex ‘mix-n-matches’ and at the end of this, you should know why foo is NaN.

First, a screenshot showing more funny behaviour:

Addition and Subtraction
Addition and Subtraction

A brief Maths Refresher

Associativity

The result of the mathematical operation is always same regardless of the ‘consumption’ order of the operands during the operation. Associativity deals with the operators and is important in resolving situations that involve an operand between two operators. In the examples below, there is always a number between the two mathematical operators. Associativity rules remove the ambiguity that might arise in these situations.

Addition and multiplication are associative operations.

(1 + 2) + 3  = 1 + (2 + 3);
(1 * 2) * 3  = 1 * (2 * 3);

Side Note: Mathematical operations on floating point values (IEEE 794) suffer from rounding errors and can give funny results.

Non-associativity

Order matters, opposite of associativity. Operations could be left-associative or right-associative.

5 - 3 - 2 = (5 - 3) - 2; //left associativity
var a = b = 7; // a = (b = 7); //right associativity

Commutativity

The result of the mathematical operation is always the same regardless of the position of the operands. Commutativity, as opposed to associativity, focuses more on the operands – if swapping the place of  operands does not affect the result then it is commutative. As again, addition and multiplication are commutative (and associative as well) while division and subtraction are not.

1 + 2 = 2 + 1; //commutative

3 * 5 = 5 * 3; //commutative

1 - 2 != 2 - 1; //not commutative

Mathematics and Programming: The Interesting Divide

Operators can be overloaded in Mathematics and programming and in both cases the input values (i.e. operands) determine the right operation. For example, the multiplication symbol X can either signify pure arithmetic multiplication if both values are numbers or a vector cross-product if both inputs are vectors or even scalar vector multiplication. Similarly in programming, the + operator is usually overloaded to mean both addition and string concatenation, depending on context and usage.

Overloading has constraints; for example, the expression 1 + “boy” is invalid (and quite absurd) in the mathematics realm; operands have to be members of well-defined sets in other to get meaningful results.

Operators in strongly-typed programming languages, like their Mathematical counterparts, only allow operations on compatible types. Programmers have to explicitly coerce types to expected values if they want to mix and mash.

Weakly-typed languages offer no such restrictions, rather they attempt to automatically deduce the programmer’s intent and coerce values based on some heuristics. As expected, surprises occur when the language’s interpretation differs from the programmer’s intentions.

For example, consider the expression 1 + “2” in a weakly-typed programming language, this is ambiguous since there are two possible interpretations based on the operand types (int, string) and (int int).

  • User intends to concatenate two strings, result: “12”
  • User intends to add two numbers, result: 3

The only way out of the conundrum is the use of operator precedence and associativity rules – these determine the result.

How JavaScript adds numbers

Steps in the addition algorithm

  • Coerce operands to primitive values

The JavaScript primitives are string, number, null, undefined and boolean (Symbol is coming soon in ES6). Any other value is an object (e.g. arrays, functions and objects). The coercion process for converting objects into primitive values is described thus:

* If a primitive value is returned when object.valueOf() is invoked, then return this value, otherwise continue

* If a primitive value is returned when object.toString() is invoked, then return this value, otherwise continue

* Throw a TypeError

Note: For date values, the order is to invoke toString before valueOf.

  • If any operand value is a string, then do a string concatenation
  • Otherwise, convert both operands to their numeric value and then add these values

The case for the unary + operator

The unary + operator is quite different – it forcibly casts its single operand to a number.


//Cast to number

+"3";

//Convert to string

"" + 3;

The first case uses the unary operator which will convert the string to a number while the second case casts to a string by passing a string as one of the operands to the addition operator.

But what about the – operator?

Subtraction is great because it is not overloaded to signify other operations; when used, the intent is always to subtract the RHS from the LHS. Thus, both operands are converted to numbers and then subtraction is carried out on the numeric values. And this is why you can use the – operator to cast values too.

Trying to subtract a string of characters from another string of characters is undefined and you’ll always get a NaN.


"3" - "";

; 3

//Relying on implicit conversion in - operator

Examples

The table of coercions

First, a table showing the generated values from coercion operations. This makes it very easy to deduce the result of mix-n-mash expressions.

Primitive Value String value Numeric value
null “null” 0
undefined “undefined” NaN
true “true” 1
false “false” 0
123 “123” 123
[] “” 0
{} “[object Object]” NaN

Examples – The fun starts

Some examples, try to see if you can explain the results. Believe me, this is a fun fun ride. Enjoy!

1 + 2;

Output: 3
Why?: Addition of two numbers

'1' + 2;

Output: ’12’
Why?: Addition of a number and a string – both operands are converted to strings and concatenated.

2 - 1;

Output: 1
Why?: Subtraction of two numbers

'2' - 1;

Output: 1
Why?: Subtraction of a number from a string – both operands are converted into numeric values

2 - '1a';

Output: NaN
Why?: Subtraction of a string from a number – conversion of ‘1a’ into a number value gives NaN and any Maths op involving a NaN gives a NaN.

2 + null;

Output: 2
Why?: Addition of a number and the null primitive, numeric value of null primitive is 0 (see table of coercions). 2 + 0 is 2.

2 + undefined;

Output: NaN
Why?: Addition of a number and the undefined primitive – numeric value of undefined primitive is NaN (see table of coercions) and operations involving a NaN give a NaN.

2 + true;

Output: 3
Why?: Addition of a number and the true primitive – numeric value of true primitive is 1 (see table of coercions). 2 + 1 = 3.

2 + false;

Output: 2
Why?: Addition of a number and the false primitive – numeric value of the false primitive is 0 (see table of coercions). 2 + 0 = 2.

Fun with objects

The preceding part covered mostly primitives (with the exception of strings), now on to the big objects; pun intended.

First objects

2 + {};

Output: 2[object Object]
Why?: {}.toValue returns {} (which is not a primitive) so {}.toString() is invoked and this returns the string ‘[object Object]’. String concatenation occurs.

{} + 2;

Output: 2
Why?: This one is quite tricky I admit. JavaScript sees the {} as an empty execution block, so technically the above sample is equivalent to + 2 which is 2.

var a = {};
a + 2;

Output: [object Object]2
Why?: The assignment removes the ambiguity – JavaScript knows for sure it is an object literal. The rules of conversion follow as earlier described.

Arrays next!

2 + [];

Output: “2”
Why?: [].toValue returns the array (which is not a primitive) hence [].toString() is invoked and this returns the empty string. The operation is now 2 + “” and this results in string concatenation.

[] + 2;

Output: “2”
Why?: Same as above

Associativity and Evaluation

JavaScript + operator is left-associative, this means operands are evaluated from left to right when they occur more than once in a series. Thus 1 + 2 + 3 in JavaScript (being left-associative) will be evaluated as (1 + 2) + 3 and so on. You can read more here.

Now to the samples again!

1 + 2 + "3";

Output: “33”
Why?: left-associativity ensures this is (1 + 2) + “3”, which goes to 3 + “3”, giving 33

1 + "2" + 3;

Output: “123”
Why?: This will be evaluated as (1 + “2”) + 3, and then “12” + 3

"1" + 2 + 3;

Output: “Left as an exercise ;)”.
Why?: Share your answer in the comments.

Conclusion

This post was actually motivated by Gary Bernhardt’s very popular WAT talk, at this stage I hope you have gained the following:

  • Ability to fearlessly refactor JavaScript code that is lacking parentheses or has no clear operator/operand ordering.
  • A deeper understanding of how JavaScript evaluates expressions and operations on primitives and object types

Do let me know your thoughts in the comments!

Related Posts

How to implement the Y-combinator in JavaScript


This post provides a very simple step-by-step implementation of the Y-combinator in JavaScript. You should be able to implement the Y-combinator in your language of choice after reading this post; as you’ll see – it’s that easy.

What is a combinator?

According to wikipedia,

A combinator is a particular type of higher-order function that may be used in defining functions without using variables. The combinators may be combined to direct values to their correct places in the expression without ever naming them as variables.

The emphasized text highlights the most interesting part of the definition – combinators allow functions to be defined without variables. Imperative programming relies heavily on variables and trying to eschew variables can be a mind-stretching exercise.

Show me the code!

The following code snippet is a Y-combinator example of the factorial function in JavaScript.

var yCombFact = function (number) {
    return (function (fn) {
        return fn(fn, number);
    })(function(f, n) {
        if(n <= 1) {
            return 1;
        } else {
            return n * f(f, (n - 1));
        }
    });
};

yCombFact(5);
//120

Looks abstruse right? No worries – lets break it down.

Two major things

There are two major concepts that help drive home the understanding of the Y-combinator:

  1. No variables allowed – this implies we have to find a way to write the factorial function without using variables.
  2. Invoking the no-variable-function defined in 1 without using variables again.

Part 1: Rewriting the factorial function without variables

Here is the factorial function definition. Can you spot the variable usage in it?

var factorial = function(n) {
    if(n <= 1) {
        return 1;
    } else {
        return n * factorial(n - 1);
    }
}

factorial(5);
//120

The expression n * factorial(n – 1) only succeeds because it can find the variable factorial in scope; without it, the factorial function would not be recursive. But remember, we are trying to do away with all variables.

The workaround is to pass in the variable reference as a function parameter. In the factorial example, recursion can then be achieved by using the placeholder parameter as the reference. The no-variable-factorial function looks like the following:

var noVarFactorial = function(fn, n) {
    if(n <= 1) {
        return 1;
    } else {
        return n * fn(fn, (n - 1));
    }
}

noVarFactorial(noVarFactorial, 5);
//120

The new definition works exactly like the old one but without the internal dependency on the factorial variable. Rather, recursion succeeds by relying on the ‘injected’  parameter and the computer is none the wiser.

Part 2: Invoking the no-variable function without variables

We have rewritten the factorial function to avoid variables however we still need to store the factorial function in a variable before invoking it


var factorial = ...;

factorial(factorial, 5);

The trick? Functions to the rescue again! Let’s create a factorialCalculator function that uses the noVariableFactorial definition above.

function factorialCalculator(n) {
    //as defined earlier
    var noVarFactorial = ...;
    return noVarFactorial(noVarFactorial, n);
}

factorialCalculator(5);
//120

The noVarFactorial name has to go since we want to avoid variables. And how do we achieve this? Yes, functions once more. So lets create a wrapper function inside the factorialCalculator that invokes noVariableFactorial.

function factorialCalculator(n) {
    var wrapper = function (noVarFact) {
        return noVarFact(noVarFact, n);
    }
    return wrapper(noVarFactorial);
}

factorialCalculator(5);
//120

Unfortunately, the wrapper function has led created another wrapper variable and this has to be eliminated too. For a complete implementation, the two variables (wrapper and noVarFact) have to go.

It’s now time to leverage language specific idioms to achieve this. JavaScript has the IIFE idiom which allows you to immediately invoke a function (read about it here). Using it, we can eliminate the need for the wrapper variable as thus:


function factorialCalculator(n) {
    return (function (noVarFact) {
        return noVarFact(noVarFact, n);
    })(noVarFactorial);
}

factorialCalculator(5);
//120

Combining all the pieces

The last thing is to insert the noVarFact definition so that it is no longer a global variable in the scope. Just as we do in Mathematics, we can just ‘substitute’ the value in. The final piece is then:

function factorialCalculator(n) {
    return (function (noVarFact) {
        return noVarFact(noVarFact, n);
    })(function(fn, n) {
        if(n <= 1) {
            return 1;
        } else {
            return n * fn(fn, (n - 1));
        }
    });
}

factorialCalculator(5);
//120

And that, my friends, is the yCombinator in all its glory. I have decide to leave the variables as they are to make it all clear but here is the standard format so you know it when you see it


function factorialCalculator(n) {
    return (function (fn) {
        return fn(fn, n);
    })(function(fn, n) {
        if(n <= 1) {
            return 1;
        } else {
            return n * fn(fn, (n - 1));
        }
    });
}
factorialCalculator(5);

Conclusion

The Y-combinator is quite easy to understand – it requires understanding function invocation patterns, variable substitution by parameters and higher level functions. As an exercise, can you try implementing the fibonacci using the Y-combinator approach? Better still, why not create a Y-combinator function that accepts function that match the fn(fn, n) signature?

Related Posts

SICP Section 3.3 – 3.5 : Found a bug in memq


1. Is memq broken?

memq is an in-built list search function; it finds the first occurrence of a key in a list and returns a new list starting from that key.


(memq 3 (list 1 2 3 4))
//; '(3 4)

(memq 5 (list 1 2 3 4))
//; #f

Now that you know what memq does, lets look at some weird behaviour

(define x (list 1 2))
(define a (list 3 4))

//; append x to the list
(set! a (append a x))
(memq x a)
//; #f -> x is not in a

Building on that foundation leads to the following conundrum

(define x '(1 2 3))

//; Create a cycle: last element of x is itself
(set-cdr! x x)

//; is x in x?
(memq x x)

//; never-ending loop

memq tests whether the key exists in the list and if it does, then it returns the list starting from the first occurrence of the key. But what is the first occurrence in a cyclic list? Could this be the bug?

2. Algorithms, algorithms

  • Horner’s algorithm

This is useful for calculating polynomial values at discrete points; for example, given a polynomial function, f(x) = 7x³ + 4x² + 4; what is f(3)? A potential application (and possible interview question too) is to convert string values into numbers – 1234 is the value of x³ + 2x² + 3x + 4 when x is 10.


//assuming polynomial is represented from lowest power to highest

//i.e. 1234 -> [4, 3, 2, 1]

function horner (poly, base) {
    if(base === 0) {
        return 0;
    }

    var val = 0;
    var polyLen = poly.length;
    for(var i = 0; i < polyLen; i++ ) {
        val += poly[i] * Math.pow(base, i);
    }
    return val;
}

horner([4,3,2,1], 10);
//1234

  • Fast exponentiation

Because going twice as fast is more fun than going fast.


function exponent (base, power) {
    var val = 1;
    while(power > 0) {
        val = val * base;
        power = power - 1;
    }
    return val;
}

Now, lets look at fast exponentiation.

function fastExponent(base, power) {
    if(power === 1) {
        return base;
    }

    //handle odd powers
    if((power % 2) === 1) {
       return base * fastExponent(base, (power - 1));
    }

    var part = fastExponent(base, (power / 2));
    return part * part; //square of part also works
}

fastExponent(10,3)
//1000

Fast exponentiation grows logarithmically Ο(log N) while the normal one is Ο(N). This same concept can be reapplied to similar scenarios.

3. Streams

Functional programming offers many advantages but one potential downside is performance and needless calculation. For example, while imperative programming offers quick exit constructs (e.g. break, continue); functional programming constructs like filter, map and reduce have no such corollary – the entire list has to be processed even if only the first few items are needed.

Streams offer an elegant solution to this issue by performing only just-in-time computations. Data is lazily evaluated and this makes it possible to easily (and beautifully) represent infinite lists. Inshaaha Allaah I should explain this concept in an upcoming post. It’s very beautiful and elegant and powerful.

Related Posts on my SICP adventures

  1. SICP Review: Sections 3.1 & 3.2
  2. SICP Section 2.5

5 things I have gained from SICP: Section 1.2


Alhamdulillaah I finally completed section 1.2 of the SICP classic. I was amazed I took almost 150 days to complete 61 pages. I definitely need to sit up! Reading the text was not challenging, the bottleneck was ‘forcing’ getting myself to finish the exercises.

I was going to start on section 1.3 but I felt it would be better to reflect on the knowledge gained first.

1. A deeper understanding of procedures and their processes

Not all procedures are created equal – seemingly recursive procedure can be iterative in execution. Moreover, there is nearly always a way to make a recursive procedure execute in an iterative manner. This can lead to speed gains and also help you handle HUGE data.

Programming is deeper than I thought…

2. Mathematics

My Mathematical forays are usually related to Computer Science or Machine learning however I got to dabble into new fields. First was the Ackermann function; a fascinating recursive function which grows with mind-boggling speed – you have to think hard to grasp this. Next came primality testing and probabilistic approaches to prime number verification.

The Fermat test is good enough however it is fooled by the Carmichael numbers. To be sure, use the Miller-Rabin test.

I do not know how much of these will be useful in real-life but yeah… it is good to know.

3. Algorithms – exponentiation

The rapid exponentiation algorithm was an eye-opener – do stuff twice as fast :). Once you implement the ‘speed-up’ and ‘slow-down’ functions and handle all cases properly, you can take down exponentiation from a O(n) operation to a O(log n) operation. Combining this with a ‘tweaked’ recursive-but-iterative-in-execution procedure leads to ‘tales’ of joy…

It was also interesting to see how minor changes to code could wipe out performance gains made from clever algorithms. Exercise 1.26  showed how easy it was to lose the O(log n) gains from exponentiation by doing irrelevant work. A subtle ‘refactoring’ might have huge implications…

5. Perseverance

I initially find most exercises daunting and struggle to understand the task. I force myself to think about the problem for at most 15 minutes; if I still do not get it, then I allow myself to look at available solutions.

Alhamdulillaah I usually figure out the solution during the time window and then look up existing solutions to see other problem-solving approaches. I have also looked up solutions when I got stuck too – I was seeking ‘inspiration‘ :). It’s great to know we can solve most problems if we only persevere insha Allaah.

And that’s about it! Section 1.3 has about 18 exercises; since I typically solve an exercise in about 2 – 3 days (I have a 25 minute daily study schedule), I hope to be done with this section in about 4 – 6 weeks insha Allaah. Watch out for a new update then insha Allaah.

Here are my solutions on Github.

Here are a couple of my more academic-themed musings

1. Research is hard!

2. Wrangling with HUGE data

3. MOOC Review: Machine Learning

Understanding Partial Application and Currying


Partial application and currying are two popular concepts from functional programming and help cut code replication in certain scenarios. This article assumes an understanding of the JavaScript bind function; binding is related to partial application in JavaScript.

Partial Application

In pure functional languages, functions are not ‘invoked’, rather a set of arguments is ‘applied’ to them. Now, if all the arguments are not passed in, then the function is being  ‘partially‘ applied.

Partial application converts variadic functions (i.e. functions with multiple parameters) into functions of lesser arity by pre-specifying certain argument values.

Lets take the exponential function (x^y) from the mathematics domain as an example; an exponent value of 2 gives the square function while a value of 3 gives the cube function. The base exponential function has one of its inputs fixed to a certain value.

Let us take a simple example with sums to see how partial application can lead to new functions.

function add(a,b) {
    return a + b;
}

//partial application
function incrementBy(n) {
    return add.bind(null,n);
}

var add2 = incrementBy(2);
add2(2);
//4

var add4 = incrementBy(4);
add4(10);
//14

Currying

No, not Indian/Pakistani Curry (they taste awesome by the way… :) ). The name Curry comes from Haskell Curry, a renowned mathematician who has a lot of things named after him (Haskell is another one) .

Currying is somewhat similar – it turns a variadic function into a sequence of chained monadic (i.e. single argument) functions. For example, a curried triadic function will be invoked as a chain of 3 monadic functions.

function add(a,b) {
    return a + b;
}

//currying
function curriedAdd(a) {
    return function(b) {
        return add(a,b);
    };
}

curriedAdd(3)(4);
//7

var add3 = curriedAdd(3);
add3(7);
//10

 

That was a simple example, let’s look at a more generic curry function.

function curry (fn)
{
 var arity = fn.length,
 exec = function (args){
   args = args || [];
   return function helper(){
       var helperArgs =
            [].slice.apply(arguments);
       var argsTally =
            args.concat(helperArgs);
       if(argsTally.length &amp;amp;gt;= arity){
          return fn.apply(null,argsTally);
       }else{
          return exec(argsTally);
       }
    };
 };

 return exec();
}

function sum (x,y,z) {
    return x * y * z;
};

curriedProduct = curry(sum);

curriedProduct(1)(2)(3);
curriedProduct(1, 2)(3);
curriedProduct(1, 2, 3);
curriedProduct(1)(2, 3);
//All return 6

 

How does this work?!

Let’s dive in and take it apart! Fun!!

1. The length property of the function being curried is its arity (number of expected parameters).

2. The exec function allows currying until all parameters are matched, its args array grows during repeated invocations.

3. The helper function is returned on curry calls. Its helperArgs array allows capturing new parameters until the original function’s arguments are all matched.

If the number of parameters in helperArgs equals the original function’s arity, then all parameters have been passed in and we can execute the function successfully. Otherwise, returning the exec function with the updated arguments allows currying to continue.

This process continues batching up arguments until the curried function’s arity is matched or exceeded.

4. Note that extra parameters are dropped when functions are called with excessive arguments. The curry tastes good with extra input ;)

Let’s trace it out!

A curried function’s execution is traced out below:

curriedProduct = curry(sum);
[arity:3]
-> exec()
-> helper() 
//Gets returned to curriedProduct
curriedProduct(1,2)(3) 
-> helper(1,2) 
[argsTally: [1,2], argsTally.length(2) < arity(3)]
-> exec([1,2])
-> helper() 
[args:[1,2]; available to helper through closure]
-> helper(3) 
[argsTally:[1,2,3]; argsTally.length(3) >= arity(3)]
-> sum.apply(null, [1,2,3])
-> 6

Note: Technically, currying allows only monadic functions…

Differences

While it does seem currying and partial application are the same thing but for two subtle differences:

1. Currying will only invoke the original function when all parameters have been passed in while partial application will invoke the base function regardless of this.

2. Curried functions will always return new functions until all arguments have been applied whereas partially applied functions will always return the same base function with some pre-determined argument values. This becomes quite obvious in functions with an arity > 2.

Why should I care about this?

1. Allows you to think in a new way about programming.

2. Both concepts allow you to create new helper functions and utilities that are useful in a lot of situations.

3. Functional programming is cool.

4. At least you now (hopefully) know the difference between partial application and currying!

Conclusion

What do you think about both functional programming concepts? Add your comments and do check out other interesting posts.

1. Understanding Tail Recursion

2. Programming Language Type Systems II

3. Programming Language Type Systems I

4. Quirky Quirky JavaScript: Episode One

Understanding Tail Recursion


A tail call is the last call executed in a function; if a tail call leads to the parent function being invoked again, then the function is tail recursive.

function bar() {};
function baz() {};
function biz() {};

function foo() {
    bar();
    return baz();
}

function foo2() {
    if(bar()){
        return baz();
    } else {
        return biz();
    }
}

In the first function – foo; the baz() function is a tail call while in foo2, the baz and biz are tail calls because both are the last calls to get executed in the function. The second example shows that the tail call doesn’t have to be the last line of the function – it just has to be the last call the function makes before it returns.

Deep Dive

Lets analyze the factorial function.

function fact(n) {
    if(n <= 1) {
        return 1;
    } else {
    	return n * fact(n-1);
    }
}

fact(5);
//120

fact(30000);
//RangeError: Maximum call stack
//size exceeded

A fact(3) call is traced out below:

fact(3) -> 3 * fact(2)
3 * (2 * fact(1))
3 * (2 * 1)
3 * 2
6

Every recursive call has to be totally evaluated before the final value can be calculated.

Now let’s make a tail-recursive version of the same function.

function fact2(n) {
    function tailFact(n, acc) {
        if(n <= 1) { return acc; }
        return tailFact(n-1, n * acc);
    }

    return tailFact(n, 1);
}

Evaluation of this function now looks like this:

fact2(3) -> tailFact(3, 1)
tailFact((3-1), (3*1))
tailFact(2, 3)
tailFact((2 -1), (2*3))
tailFact(1, 6)
6

The tail recursive function does not use the stack; the function calculates the factorial as it goes along! The last recursive call performs a simple calculation based on the input arguments and returns.

Thus, there is no need to push frames on to the stack because the new function  already has everything it needs. As such stack overflows would not occur with tail call optimized functions.

One neat little thing to know is that the tail-recursive calls do not change the original values; in the factorial function example above, the value of n is never changed, rather new n values are passed in to new function calls.

Optimizing tail calls
A few languages (e.g. scheme) will optimize tail calls so that they run as loops however other languages e.g. JavaScript, Python do not; even if a function is in the tail call form; it’ll still run out of stack space.

The trick to get a non-optimizing compiler to simulate tail recursion involves the use of a trampoline; lets see examples below

//tail-call version runs out of space

fact2(30000);
//RangeError: Maximum call stack
//size exceeded

function trampoline(fn) {
    var result = fn;
    while (result instanceof Function) {
        result = result();
    }
    return result;
}

//Has subtle error
function fact3(n){
    function tailFact(n, acc){
        if(n <= 1) { return acc;}
        return tailFact(n-1, n*acc);
    }

    return trampoline(tailFact(n,1));
}

fact3(30000);
//RangeError: Maximum call stack
//size exceeded

The fact3 function trampoline call fails because we pass in a function expression and not a function reference. The execution steps are traced out below:

fact3(2)
trampoline(tailFact(2,1))
trampoline(tailFact(1, 2))
trampoline(2);

The trampoline is only called after the tailFact completes its recursive operations. We need to make two changes:

1. Return a function reference from the tailFact function.

2. Pass in a function reference to the trampoline function.

The Function.prototype.bind function is perfectly suited for this purpose and the transformed functions are

//Tail recursive
function fact4(n){
    function tailFact(n, acc){
        if(n <= 1) { return acc;}
        //return function reference
        return
          tailFact.bind(null,n-1,n*acc);
    }

    //update to function reference
    return trampoline(
       tailFact.bind(null,n,1)
    );
}

fact4(30000);
//Infinity

Benefits

The biggest advantage of using tail calls is that they allow you to do extensive operations without exceeding the call stack. This makes it possible to do a lot of work in constant space without running into out of memory exceptions; this happens because the frame for the currently executing function is re-used by the newly-executed function call.

It is possible to write the same function using tail call recursion and this will allow you to do it for arbitrarily large values. The space optimization is from O(n) to O(1) since every new function call reuses existing frames rather than pushing stuff on the stack.

Another advantage is that tail calls are essential for the continuation passing style (also called fluent programming; a popular example is jQuery function chains). Imagine chaining a huge set of calls together without needing to worry about running out of stack space – after all you need a location to store the executing functions.

The disadvantage of tail-call optimized code however is that it is difficult to debug; how would you know what function call lead to what function call if the call stack frames are being reused?

Conclusion

Do share your thoughts about tail recursion in the comments or check out these other exciting posts!

1. JavaScript’s Function.prototype.bind

2. Quirky Quirky JavaScript: Episode One

3. The Immediately Invoked Function Expression

4. Functional JavaScript – Tail Call Optimization and Trampolines

Addendum

I spent about a week trying to figure out why my original trampoline didn’t work until I ran into Don Taylor’s excellent article; it sure did help! Can you figure out why it’s wrong too?


function trampoline(fn) {
    var res = fn();
    while (res && res instanceof Function) {
        res = res();
    }
    return res;
}