Quick estimation tips for engineers


The Millau Viaduct. The Guggenheim Museum Bilbao. The Dubai Palm Islands.

Architectural masterpieces – beautiful art with solid engineering foundations.

Bash. Hadoop. Git.

Great software – beautiful code with solid architectural themes.

Is software development art? Pretty much. Is it engineering? Pretty much as well.

Dams, tunnel-boring machines and turbines are similar to operating systems, high-performance load-balancers and big data systems. Regardless of the end product’s tangibility – good engineering is necessary for delivering great results.

Engineers need to estimate system performance and simulate real-life scenarios. For most engineering fields, there are rich banks of proven theories and mathematical relations to rely upon. Unfortunately, software engineering – the new kid on the block – has a few rigorous rules, most times we rely on heuristics and handed-down wisdom.

The good news is that most of these rules can be successfully tailored to software engineering.

This post examines a few of these rules and shows how to use them.

1. Rule of 72

This is great for estimating exponential growth rate scenarios. It is well known in the financial sector but can be easily applied to algorithmic growth. The rule:

An exponential process with a growth rate will roughly double its value in time if r X t = 72.

The rule has its roots in logarithms (log 2 ~ 0.693). 69 is more accurate however it doesn’t have as many factors. 68 and 70, which are just as close, have the same flaw too. The closest easy-to-factor number is 72 (factors are 1,2,3,4,6,8,9,12,18,24,36 and 72).

Example

An exponential algorithm with a growth rate of 4% (i.e. it takes 1.04 more time to run a problem size of n + 1 compared to n) will have its run time doubled when the problem size increases by a factor of 18. Why? Because 4 * 18 = 72.

What if the problem size increases by 90?

4 * 90 = 360

360 / 72 = 5

Thus, we can expect the run time to double 5 times, a 32-fold (2 ^ 5) increase in run time.

Test: What problem size would cause a 1000-fold increase in runtime? Hint 1000 ~ 1024, (2^ 10).

2. π seconds make a nanocentury

Well, not exactly but close enough for back-of-the-envelope calculations and easy to remember.

π, the ubiquitous constant, is approximately 3.1427 thus π seconds is ~3.1427 seconds. A nano-century is 10-9 * 100 years which is 10-7 years. There are approximately 3.154 x 10seconds in a year and thus about 3.154 seconds in a nano-century.

The 0.3% difference between 3.154 and 3.142 is safe enough for quick estimates. Thus, π can be used for such quick calculations (with some minor accuracy losses).

Example

Let’s build on rules 1 and 2 with a possible real-world scenario.

An exponential program with a growth rate of 72% takes 300 seconds to run on a sample size n; would it be wise to run the program on a problem with n = 1000000?

Using the rule of 72, a 1000-fold increase in leads to a 10x increase in run time.

0.72 * 1000 = 720.

720 / 72 = 10

Doubling the run time 10 times gives a factor of 1024. The 1000-size problem should take about 1024 * 300 seconds ~ 300 000 seconds.

Let’s invoke the π seconds rule next, 300 seconds ~ 100 π seconds which in turn gives about 3.6 days. If a 1000-fold increase will cause a 4-day wait period; you can well imagine what a million-fold increase would lead to. Spending 3 days on a better algorithm might be worth it…

3. Little’s law

Little’s law states that the average number of things in a system L, is equal to the average rate at which things leave (or enter) the system λ, multiplied by the average amount of time things spend in the system, W. i.e. L = λ * W.

Imagine a processing system with a peak capacity of 2000 requests per second. Let’s further assume that is a 5-second processing time for each request. Using Little’s law, the system needs to robustly accommodate 10,000 requests (you better start thinking about memory too).

One fix might be to drop all messages that have spent more than 5 seconds in the system, thereby allowing new requests to come in. Otherwise, the system would need extra processing capability or storage under heavy loads.

A better approach would be to over-engineer and design a system with a higher capacity e.g. one with a 2500 requests per second limit. Thus 2000 requests per second spikes would push the system to  a comfortable 80% of max capacity.

Is over-engineering bad? Doesn’t it go against YAGNI? Well, think of this:  over-engineering is one of the reasons why the 132-year old Brooklyn Bridge is still standing today.

Little’s law is very useful when building web servers, real-time systems and batch processors. Stable complex systems can be built by linking several small systems that obey Little’s law. It has also been applied to Kanban work-in-progress (WIP) limits.

The beauty of Little’s law is in its simplicity and versatility – you can even use it to estimate queue wait times!

Note: The Guava RateLimiter mentions the rule, not sure if it implements it though.

4. Casting out 9s

A quick rule of thumb for verifying results of mathematical operations. Let’s take summation as an example.

Given x + y = z; the sum of all the digits in and modulo 9 must be equal to the sum of all the digits of modulo 9. Difficult? Nope, just cast out 9s as you go along.

Let’s take an addition example:

  1242

+3489

+4731

 _____

 9462

The sum of digits in the sum is (9 + 4 + 6 +2) = 21. 21 modulo 9 is 3. The sum of digits in the addends is (1 + 2 +4 + 2) + (3 + 4 + 8 + 9) + (4 + 7 + 3 + 1) = 9 + 24 + 15 = 48. And 48 modulo 9 is 3. Since both remainders are 3, we can assume the addition is correct.

For faster results, cast out 9s as soon as they are generated. For example, 9462 gives 9 + 4 + 6 + 2 which gives 9 + 1 + 2 = 3.

Subtractions can be verified by casting out 9s in the reverse operation. For example a – b = c turns into a = b + c. The rule also works for multiplication and division too.

P.S: Teach this to your kids…

More and more scenarios

How many books can a 1.44MB floppy disk hold?

Assuming a book has 250 pages, with each page containing 250 words and an average word length of 5 characters. This gives an average of 312500 characters per book.

A character can be stored in a byte; thus a book could be stored in about 0.325MB (~0.31 MebiBytes MiB). 4 such books will easily fit in a 1.44MB floppy disk. What about a 2GB memory stick? That can hold 6100+ books (several lifetimes of reads).

Can I physically move data faster than 100MB/s Ethernet?

How long does it take to transfer 1TeraBytes of data via a 100MB/s Ethernet link channel?

1 Terabyte = 1000000 MegaBytes; thus the Ethernet link channel would take 1000000 / 100 = 10000 seconds for a successful corruption-free transfer; which is about 2.778 hours. If physically moving a 1 TeraByte drive to the new location takes 30 mins; then it makes more sense to do that.

Need to move 10 Terabytes? Maybe a courier would be faster…

Conclusion

This post is to show the power of back-of-the-envelope and how they enable us to make quick accurate estimates.

Related

Casting out nines

Advertisements

13 thoughts on “Quick estimation tips for engineers

  1. This an enlightening post, I’m sure it would come in handy.
    Another this I would like you to write on is standard tools that can be employed to monitor the performance and execution time of critical parts of our software.
    Thanks.

    Like

  2. Nice Post, very interesting.

    In MIT, Engineers are taught a course called Street Fighting Mathematics (there is actually a book called that), the course teaches skills to do quick and dirty estimations.

    Check MIT OCW for more details

    Like

  3. […] Building and distributing the smallest software piece you can imagine requires more effort than you would think. It is much more efficient to break down the big picture into small chunks of work that can be completed in an hour or less. Such breakdowns make you more effective and help in understanding progress and forecasting completion times (which is a tricky problem to solve). […]

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s