Why computer science matters for software developers


I used to think computer science never mattered because I rarely used algorithms and never saw the value of algorithm-based interviews (I still don’t ūüėČ ). The few folks I asked also concurred so I felt I was right.

September 2016

My team got reorged and our goal was to build and deliver a brand new SaaS offering. As part of that move, I made a conscious decision to switch to full stack engineering + dev ops. My experiences have made me re-evaluate my initial stance and realize I was wrong.

Computer Science does matter, a lot! Having that foundation empowers you to

  • Make better tradeoff decisions
  • Innovate new ways of solving problems
  • Spot design pitfalls early; e.g. a design that violates CAP theorem is a disaster waiting to happen.
  • Avoid solving impossible problems e.g. trying to parse HTML with regex.

The following paragraphs show how computer science principles underpin common concepts.

1. File systems and the HTML DOM

What do hierarchical file systems have in common with the HTML DOM? Simple, both are based on trees. Some common operations on trees include reparenting, search and walks. The table below shows how file system and DOM operations can be tied back to basic tree operations.

Tree Operation File system DOM
Search File search Search
Traversal Directory listing Layout rendering
Node reparenting File move Hiding and showing sections of the DOM

Having this foundation allows you to ask deeper questions. For example, let’s analyze the reason behind the discouragement of excessive DOM manipulations.

It’s not the DOM – tree operations are fast. The challenge lies in repainting the entire HTML layout when deep-lying nodes change. If you keep moving nodes around, the browser has to play catch up and this adds up over time.

The ‘best practice’ is to detach the node, manipulate it and then re-attach it. This approach means the browser repaints twice – on attach and detach. The detached node can change several times without triggering DOM reflow since it’s no longer attached.

2. Solving scheduling problems

I wrote a timetable generator in PHP as an intern 8 years ago. Yes, it was a brute force solver and took about an hour to generate a timetable for an average-sized school.

A quick inefficient solution can get you to the market fast and might even work well at small scale. However, any attempt to extend or improve such solutions would be prohibitive. My brute force solution of 2009 would have broken for larger problem sets; a trivial ask such as introducing a new constraint e.g. multiple teachers for multiple classes, would have necessitated a rewrite.

The timetable problem is a constraint satisfaction problem (CSP). Other popular CSP problems include appointment scheduling, map colouring and the 8 queens puzzle. Backtracking search is a standard way to solve CSPs. Solvers can also leverage greedy search, minimum-conflicts heuristics and simulated annealing.

This approach separates the problem from the solver; thus it becomes easy to change the constraints and extend the solver to new scenarios.

3. Avoiding stack overflows

How would you make sure your recursive function never runs out of stack space? This might not be a problem if the language optimizes tail calls but alas most languages don’t.

  1. You could catch the stack overflow exception but that doesn’t¬†mean your computer can’t calculate the value. It just means the recursive implementation exceeded the computer’s stack memory limit.
  2. You could convert the recursive function to a loop. This would require passing in values around and might not be so elegant.

A trampoline solves the problem beautifully and can be reused for all recursive functions and across languages. Read this post to learn how a trampolined factorial function allows you to compute the factorial of 30000.

4. Consistently parsing HTML with Regex?

A common joke is that experienced programmers send newbies on a wild goose chase by asking for a regex parser for HTML. It’s been long since I did any automata course but the short answer goes thus:

  • Regular expressions are a regular grammar (Type 3)
  • HTML is a¬†context-free¬†grammar (Type 2)

Type-2 grammars encompass Type-3 and are thus more complex. See the Chomsky hierarchy.

It might be safe to do this for a small set or to extract data out of pages. But saying this is something that’ll work all the time is not true.

Tip: Know the class of impossible computing problems and save yourself from wasting time on futile challenges.

Conclusion

Concepts like design patterns, computer networks, architecture, etc. all matter in the software engineering profession. However, having a solid CS background is key.

There are loads of opportunities to apply computer science principles to daily tasks – you just have to make the push.

Thoughts?

Related

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s