Fail loudly, fail fast

vedant agarwala (~vedant479)


Description:

Failing loudly is arguably one of the most counter-intuitive aspects of software engineering. No one wants their program to "fail", i.e. to error out. But this is precisely what a program must do when it cannot continue to execute in the given situation. In real-world systems, failures and crashes are not the worst outcome, and sometimes they are not bad at all. Some things are much worse: deadlocks, crashes long after the original bug, data loss and corruption, and data inconsistency. If a part of the system fails or the application crashes right before one of those worse things happens, we are lucky. In fail-fast systems, bugs are easier to find and fix, so fewer reach production.

If the different components (think classes, modules) of a system are programmed to fail loudly (by raising exceptions, for example), then the system as a whole behaves in a more predictable and resilient manner. When a component fails, it is up to the caller to handle the error. Some examples:

  1. ORM code that expects a valid database connection should raise an error in the constructor itself if no such connection exists, rather than failing later when the object is saved to the database.
  2. Fail-fast iterator: an iterator that attempts to raise an error if the sequence of elements it is processing is changed during iteration. It does not continue to execute, since it might skip an element or include a deleted one.
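
Both examples can be sketched in a few lines of Python. This is only an illustration: `Repository` and its `connection` argument (any object with an `is_open` attribute) are hypothetical names, not part of a real ORM. The second function relies on the built-in dict iterator, which raises `RuntimeError` when the dict changes size during iteration.

```python
class Repository:
    """Hypothetical ORM-like class used only for illustration."""

    def __init__(self, connection):
        # Fail fast: reject an unusable connection at construction time,
        # not later when save() is eventually called.
        if connection is None or not getattr(connection, "is_open", False):
            raise ValueError("Repository requires an open database connection")
        self.connection = connection


def mutate_while_iterating():
    """Return the error message raised by Python's fail-fast dict iterator."""
    counts = {"a": 1, "b": 2}
    try:
        for key in counts:
            # Adding a key changes the dict's size while it is being iterated.
            counts[key + "_copy"] = counts[key]
    except RuntimeError as error:
        return str(error)
    return None
```

The constructor check means a misconfigured `Repository` never exists, so the stack trace points at the place where the bad connection was passed in rather than at some later save.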

Additionally, raising errors with helpful messages makes the program much more readable. Anyone glancing at the source code will immediately understand which scenarios the code does not handle.

Some things I will cover:

  • prefer dict[key] over dict.get(key)
  • helpful error messages at the correct place, i.e. raise '<Helpful error message>' over return None
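
A minimal sketch of both points, assuming a hypothetical `config` dict and a hypothetical `find_user` helper (neither comes from the talk itself):

```python
config = {"host": "localhost"}  # hypothetical settings; "port" is missing


def port_quiet(settings):
    # dict.get hides the missing key: None travels on and
    # causes a failure somewhere far from the real problem.
    return settings.get("port")


def port_loud(settings):
    # dict[key] raises KeyError right where the problem is.
    return settings["port"]


def find_user(users, user_id):
    # Raise a helpful error instead of returning None.
    if user_id not in users:
        raise LookupError(
            f"No user with id {user_id!r}; known ids: {sorted(users)}"
        )
    return users[user_id]
```

With `port_quiet`, the caller only learns about the missing key when the `None` is eventually used; with `port_loud`, the traceback names the missing key at the line that needed it.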

It took me a few years to grasp the fail-fast paradigm. Hopefully this talk helps people understand why it matters. After all, it is a very counter-intuitive paradigm.

Talk Breakdown

  • 2 min: Intro
  • 5 min: What is worse than crashing
  • 10 min: Personal experiences of not failing fast and loud
  • 3 min: An Example to discuss with the audience
  • 7 min: More benefits of failing fast and loud
  • 3 min: Exceptions to failing fast and loud, wrap up

QnA!

Prerequisites:

Not much is needed. Only the basics of software programming: raising errors, modules, etc.

Bonus: an understanding of low-level design patterns and SOLID principles.

Video URL:

https://youtu.be/tYNbnIUe5ts

Content URLs:

Link to Slides. The slides are not complete: I need to add more images and reduce the visible text. The essence of each slide, however, will remain the same.

Speaker Info:

Engineering Manager at apna.co.

People like to accumulate likes and followers on Instagram. For me, it's StackOverflow.

I like software, startups, cryptocurrencies. More so, I love talking about them!

Speaker Links:

Blog

StackOverflow profile

Custom views as components [Android DroidJam '18]: YouTube recording, Slides

Section: Others
Type: Talks
Target Audience: Intermediate
Last Updated: