What every programmer should know about types I


What is a type?

Type: (noun) a category of people or things having common characteristics.

A type represents the range of values of a particular type. Let’s take an example from the mathematical concept of sets in number theory. The set of integers can be seen as a type – only values such as 1, 2, 3 are integers; decimals (e.g. 1.1) or irrational numbers (e.g. π) aren’t members of the integer set. Extrapolating to common programming types, we can create more examples:

  • Integers can only be 1,2,3…
  • Boolean types can only be true or false
  • Floating point / real numbers can represent decimals (with some loss of accuracy for irrational numbers)
  • Strings are strings.

Buckets

Think about a bucket – it can hold various things (water, sand or even fruits). What you can do with a bucket depends on two things:

  • What the bucket contains
  • The rules determining what the bucket can hold.

Lets look at two bucket usage philosophies.

Strict bucket land

The standing rule in strict bucket-land is to have specialized buckets for distinct content types. If you try to use the same bucket for apples and oranges, you’ll get a good talking to.

One advantage is that you immediately know what you are getting once you read a bucket’s label; however, if you need to fetch a single apple and a single orange, you’ll need two buckets; I know this sounds ridiculous but rules are rules.

Loose bucket land

Everything is relaxed in loose bucket-land. Everything – apples, oranges, some sand, ice cream, water etc.) – goes into the same bucket. Unlike the strict folks, no one is going to fret as long as you don’t disrupt their daily lives.

If you need an orange, you dip into the bucket, pull ‘something’ out and then check if you really got an orange. The value is tied to what you actually pull out of the bucket and not the bucket’s label.

Some loose bucketers try to imitate the strict bucketers by using explicitly-labeled buckets . Because there is no hard rule preventing mixes; a trickster can drop an apple into your orange bucket!

Static vs Dynamic Types

The metaphors in the scenario explained above follows:

  • The bucket represents variables (they are containers after all)
  • The bucket contents (e.g. apples, oranges, sand etc.) are the values
  • The rules are the programming language rules

The big schism between dynamically typed and statically typed languages revolves around how they view variables and values.

In static languages; a variable has a type and this restricts the values you can put into it and the operations you can do with it. To draw on the bucket analogy – you typically won’t (and can’t) pour sand into the fruits bucket.

Dynamic language variables have no type – they are like the loose buckets described earlier. Because the containers are ‘loose’, valid actions for a variable depend on its content. To give an analogy – you wouldn’t want to eat out of a sand bucket.

  1. Static systems – variables have a type. So a container may only hold ints, floats or doubles etc.
  2. Dynamic type systems – variables can hold anything. However the values (contents of the variables) have a type.

That is why you can’t assign a string to an int in C#/Java because the variable container is of type int and only allows ints. However, in JavaScript, you can put anything in a variable. The typeof operator checks the type of the value in the variable and not the variable itself.

Why does this matter?

The adherents of the static typing methodology always argue that if it ‘compiles’ then it should work. Dynamically typed language adherents would quickly poke holes in this and tout good testing because that guarantees code works as expected.

In a very strictly-typed language, it would be theoretically impossible to write code that would have runtime bugs! Why? Because typically bugs are invalid states and the type checks would make it impossible to represent such states programmatically. Surprised? Check out Haskell, F#, Idris (yes it is a programming language) or Agda.

The strictness vs testing spectrum

How much testing is required given a language’s type system?

I would view this as a spectrum – to the left would be the extremely loose languages (JavaScript) whereas the right end would contain the strictest languages (Idris). Languages like Java and C# would fall around the middle of this spectrum – they are not strict enough as is evident by loads of runtime bugs.

As you move from the left to the right, then the amount of testing you need to validate the correctness of your program should reduce. Why? The type system will provide the checks for you on your behalf.

Conclusion

I hope this post clarifies some thoughts about type systems and testing for you. Here are some other posts that are related:

  1. Programming Language Type Systems I
  2. Programming Language Type Systems II
Advertisements

5 thoughts on “What every programmer should know about types I

  1. Quite nice overall.
    Here are a couple of things for your bucket list:
    “Strings are strings”: If a string has one end, it has another end.
    (From a Murphy’s Laws collection)
    A string is a linear structure of characters
    – not particularly useful for shoe laces.
    Strictly typed – no run time bugs?
    What prevents you from dividing by zero?

    Like

    • Thanks Joseph! Strings can mean different things depending on what you want.

      Yes, divide by zero checks can be prevented with dependent types by specifying inputs so that a division by zero is illegal. For example, the compiler might error saying it’s possible for some inputs to be zero at runtime. A sufficiently ‘strict’ program can be collapsed to a proof.

      Like

  2. One thing to keep in mind is that dynamically typed languages still have types. It’s just that type checking is done at run-time instead of at compile-time, as in static languages. Checking at compile time is of course preferable as bugs can be detected early, without having to run the software.

    To truly get a sense of the real power of strict statically typed languages you should skip C# or Java, which haven’t really got very strict type systems, and examine F#, OCaml or Haskell.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s