Tuesday, August 11, 2009

Error Handling and Exceptions in Java

   Table of Contents


Intended Audience

    There are countless books, articles, blogs, and software forum discussions on the subject of error handling. So, why am I writing another article that stands little chance of even being picked up by Google - in the sea of all other Internet posts and publications? First and foremost, based on the real-world projects I see every day, I strongly believe that error handling (or the lack thereof) in Java applications today is a huge problem that, on a large scale, is not being solved - or even seriously considered. Clumsy, improper, often thoughtless attempts on error-handling cripple thousands of Java projects making them unjustifiably complex and... error-prone! Second, I am all but convinced by now that this sad state of affairs takes place not despite the fact that there are so many books and articles written on the subject but rather due to the fact that the vast majority of those publications are forcefully promoting (unintentionally, of course) bad practices and fundamentally flawed concepts. On the other hand, those who have long figured out the better ways of doing things, keep relatively quiet as if they want to keep the secrets to themselves or, most likely, do not want to stir up the controversy or lose some of their followers. They often simply say: "do it this way", but they won't go to a great length to explain why. They know why, they know it works very well, and it is very simple. So they just make a well-meaning suggestion in a forum, and walk away - leaving it up to the more-than-enthusiastic "old-school" zealots to pour dirt all over their comment, dismissing all or most of it - often without even reading the whole thing.

    Most of today's cutting edge Java frameworks and latest specifications seem to be doing things exactly right. The Spring Framework, or even the EJB 3.x specification are just a couple of the most obvious examples. However, I have not yet seen a book or article written by a well-respected Java authority that would openly and forcefully explain the true essense of error handling, why some approaches are better and safer than others, and why projects like Spring, for example, are chosing their way of handling errors while shying away from one of the most hyped-up features of the Java language. Most publications, in fact, continue to promote the same principles, theories, and stereotypes that have proven to fail so miserably. Many egos are at stake, software engineers generally don't like to admit that they were wrong or that their well-intended solution didn't work very well after all. And I honestly think that it seriously hurts our industry.

    If you have been frustrated or dissatisfied with the way your projects handle errors, if you have witnessed how teams struggle chasing subtle untraceable exceptions that provide no answers, and if you are genuinely looking for a better way of doing things, read on. I hope this article answers many of your questions and helps you get on the right track with your next project.

I welcome comments and critical opinions. I only have one favor to ask of anyone who wishes to dismiss my point of view: please read the whole article first and make sure your arguments are not already addressed. It took a great deal of effort to put together this article. I have really tried to deliver the information in the most comprehensive and logical way - based on hundreds of typical questions and arguments I have heard in the past. One of the frustrating experiences has been hearing or reading people bringing up arguments whose flaws have been, I believe, very meticulously exposed in this article.

    I'd also like to note that I do not intend to convert anyone who is perfectly happy with the way they are doing things now, if they believe that their approaches, however different from mine, work well for them.


Introduction


Popular False Premise

    One fundamental flaw of most solutions to error handling in software is that error handling is approached from the wrong end, to begin with. Here's a typical statement taken from a randomly selected online tutorial on exception handling: "...when you write a method that may fail, you must indicate that it may fail and how it may fail." Please note the italicized text. Although the exact wording may vary, this is, generally, the fundamental premise of the vast majority of books, tutorials, blogs, or online articles on error handling you can find today. This is the axiom, the taken for granted fact that makes the basis for all error-handling methodologies and "best" practices promoted in all those publications. Unfortunately, the very premise that some methods may fail and some may not is just plain wrong!

    Amazingly, it is a commonly accepted and hardly disputed assumption that if a module may(!) produce an error, then an indication of that possibility must be somehow built into the module to let programmers know whether or not they should provide an error-handling strategy for that module. In other words, the common thinking is: "Let's see if the function tells us that an error may occur inside; if yes, we will decide how to handle the error." If you have just read this and did not immediately see the mind-boggling ridiculousness of that assumption, you may be one of the many who have fallen into the trap. It is this philosophy that makes software systems vulnerable by not at all "expecting" large sets of error conditions, and turns error-handling solutions into unmanageable, cumbersome, inefficient, and unreliable behemoths. I will try to do my best to explain.

    It takes a 180-degree change of the view angle to see how simple and elegant error handling may - and must - be.

Always Know: Anything May Fail!

    As I have just pointed out, the common premise is based on the assumption that all functions in a software program may be divided in two categories: the ones that may fail (according to the programmer who designs the operation), and the ones for which the programmer does not expect any failures. That assumption is incorrect and dangerous in its core. If you understand the simple concept of a function or method in a programming language, you will easily see why.

    A subroutine (a.k.a "method", "function", etc.) in a programming language is the most basic form of abstraction. The main purpose of abstraction in software is to hide the implementation details of a particular functional unit, and provide simplicity of usage by exposing only a minimalistic interface via which the clients/callers may effectively invoke the abstracted functionality. Among many benefits, the concept allows to later modify the implementation details without affecting any code that depends on it. The very definition of abstraction implies that no entity that uses the abstracted functionality is in the position to make any assumptions about the implementation details of that unit and may only be aware of the exposed interface. Bingo! (Oh, and before someone even thinks about stating that silly, childish argument... No, it absolutely does not matter if the same programmer writes the client code and the function the code call!) Once something is abstracted, you are not supposed to make any assumptions of the abstracted details! Period.

When it comes to error handling in software, the only safe and correct assumption that may ever be made is that a failure may occur in absolutely every subroutine or module that exists! There are no functions for which a programmer has the right and responsibility to safely assume no possibility of failure. (Again, I don't want to hear any lame demagogy about functions that do nothing, contain no code, or simply return a constant, etc. That does not build up to a valid argument in favor of ridiculously complicating things by distinguishing between methods that may fail and those that may not.)

    Even if the designer of the particular method thinks that the method is very simple and solid, it is not responsible (and not smart) to state or imply that it may never fail. A failure may happen due to a wide variety of reasons. The method may have bugs in it. The implementation of the method, or any underlying libraries it depends on may change in the future. Some low-level code that the method depends on may have subtle bugs or not designed to work as expected by a client - in certain environments. And so on. Of course, there are two different types of failures: system/software errors, and purposefully designed/signaled exceptional conditions whose intent is to reflect different types of legitimate business situations that disallow the successful execution of the "main" workflow. We will talk about these differences later. Some people would argue that the "possible failure" premise we mentioned in the beginning usually implies the latter type ("business" conditions.) It doesn't matter. The only safe and responsible way to program is to always KNOW that any method may potentially fail to perform its main task (if not due to a business condition then due to some other type of error) and design for both successful and negative outcomes. Assuming that some methods may fail and some may not will put you on a wrong track from the start.

Two Groups of Exceptional Conditions

    Since anything may fail, it is irresponsible and incompetent of a programmer to leave some operations out of the error handling strategy simply because they do not explicitly indicate that an error might occur inside. In other words, it makes no sense at all to specify on an individual basis whether a method or component may result in an error or not. The question, therefore, is not WHETHER but WHEN and HOW different types/classes of error conditions should be handled in the system. A properly designed system must ultimately handle ALL errors, one way or the other. Therefore, the application designer faces a fairly simple task of dividing all potential exceptional conditions that may occur within the system into two very basic catecories:

  • the few that actually matter; these are the conditions that must be handled individually due to the fact that the system is expected to (can and must) do something specific in case of such conditions - as defined by the actual needs of the application (requirements), and nothing else; such conditions may be further divided into classes that require (by design!) more specific individual handling; the application designer/programmer may identify and add new error types or exceptional conditions to this group at any stage of development;
  • everything else, i.e. all other exceptional conditions that the system cannot and/or does not care to provide any form of recovery other than, perhaps, logging and graceful termination.

What I Mean by "Handling"?

    For the sake of clarity, I want to note that the term "error handling" in this article means not only "recovering" from them and implementing an alternative action. When I say "handling" I absolutely mean any type of action - by the program or programmer - to process the error, including managed program termination. This means that programmer's bugs must be handled too - in a special way, since you can't guarantee that one of such bugs doesn't make its way into Production, and nothing should simply "blow up" in Production. Regardless of whether your software runs on a space shuttle where the lives of the astronauts are at stake, or you are developing a shopping cart web application where the last thing the end users should be exposed to is some bug in your code.) The answers to the "when" and "how" questions are provided by the system requirements, and nothing else. Period. Of course, it may be fine to allow a little test application to spits an exception stack trace in your face when it blows up. However, every finished professional application that is intended to run in a production environment should always be designed to direct the most complete error logs to a safe logging destination and exit gracefully even in cases of unrecoverable errors.

    The term "handler", as used in this article, refers to a dedicated software module that implements an action to manage a specific class or group of exceptional conditions.

Ultimate Objectives of Error Handling

    When an error occurs, the module must, of course, broadcast that error information somehow. So, the job of proper error handling is really to provide a mechanism that ensures that the error signals (the tokens that carry the information about the error and its origin) are fired and freely propagate from the sources of errors to the strategically placed handlers without affecting the business logic.

Logging of the most complete error information available should be one of the mandatory steps performed by an error handler. Each type of error should be handled (and, therefore, logged) no more than once within the sub-system where it has occurred. Sometimes it is helpful to intercept and log the error at the boundary of the component where the error occurred - before propagating it further to the client. That may be necessary if the component may be potentially deployed on a different server - to ensure that the error logs are available on that server, as well as to the client applications that would do their own logging.

Thoughtful Design and Planning vs. an Afterthought

    The biggest mistake a developer or architect can make with regards to error handling in a software system is treat it as a secondary issue, something that is less important than the "main" functionality, something that may be added at the last moment.

    In a proper design and analysis of any software module or system equal consideration must be given to the desired behavior of the system for both - successful execution and any types of failure. Depending on the system, the latter may be a simple single use case, or a set of quite elaborate use cases whose complexity may measure up to or even outweigh the complexity of the successful work flow.

    Therefore, error handling strategies must be considered an essential part of the system requirements and design. Error handling may never be an afterthought, something that programmers may (or may forget to) add after the "main" work flow is implemented. Regardless of error types (e.g. critical unrecoverable system failures or predictable business conditions treated as errors), the system must be designed to accommodate for graceful and efficient handling of each such class of conditions.

    As I have stated in the Introduction to this article, when approaching error handling in software development, perhaps, the most important thing to keep in mind is the fact that everything can potentially result in an error, and a failure must be always considered as a possible outcome of an operation. This means that any given method call may potentially produce an error of some sort. There are no operations whose successful execution may be unconditionally guaranteed at any time.

    The purpose of error-handling design is not to determine what may fail (since anything may) but to determine how to appropriately divide the complete error set into the sub-sets relevant to the application's desired behavior, and where in the system each such class of failures must be treated. Fortunately, even although errors may occur in thousands of places, a typical application only needs to distinguish between a few types/classes of such conditions. This means that only a few error handling strategies normally need to be implemented per application, and all errors must simply be sorted out and properly directed to the appropriate handler.


Consolidated Error Handling

    Centralized and planned error handling is key to building efficient error-proof software systems. Regardless of the philosophy or technology used, the fact remains that error handling may never be a thoughtless knee-jerk reaction to some alerts automatically generated by a compiler or any other tool. The decision when and how to handle errors - specific or any - must always ultimately be made by the programmer based on a thoughtful consideration of the application requirements and each particular use case.

    Errors may not be reliably handled or controlled in a sporadic decentralized manner. Instead of being sprinkled all over the application, error handling must be consolidated to ensure clarity, reliability, coherence, and prevent a loss of any valuable information about the cause of the error.

Error handling is the job of a few dedicated modules that are designed to treat failures based solely on the application/business requirements.

    In the agile development environment, each new requirement and exceptional case must be analyzed to determine which part of the system should be responsible for handling it. In a case of an exceptional error condition, the call stack must immediately unwind all the way to the dedicated handler that possesses the contextual knowledge and ability to handle the error properly. The error data must reach such handlers in its original form, unaltered, with its complete stack trace. The original error information may be supplemented with additional helpful data or message - but never shortened, replaced, or otherwise altered. This is the purpose of exceptions.
  

Exceptions Fundamentals

    Exceptions in a programming language essentially serve the following purpose. They trigger an immediate unwinding of the call stack (abort the thread execution) and signal the fact and nature of the error - to anyone along the call stack who is willing to listen and take action. This, obviously, provides many significant advantages over the old-fashioned error codes. Methods may throw more than one type of exception, and each such condition may be caught/handled separately and in different methods - without cumbersome conditional logic throughout the call stack. Most importantly, exceptions allow implementing error-handling code only in those modules where it is actually appropriate - instead of doing it in every module in the call stack starting immediately at the origin of the exceptional condition. The latter is inevitable with error codes that must be checked for by each link in the call chain and propagated to the one that actually knows what to do with the error. On the other hand, with exceptions, any methods that can't really contribute to handling the error can remain completely free of any error-handling logic whatsoever while focusing on their business logic instead. All this can lead to much cleaner code and consolidated error-handling functionality.

    When the exception mechanism triggers an exception, the instance of that exception class is broadcast to all the methods on the chain of method calls that eventually resulted in the exception. This means that any of those methods can potentially be coded to catch and handle the exception, if necessary, while others can ignore it. Naturally, there is always one such method in the call chain that is better suited to handle the error than the rest. Which one? The correct answer can only be provided by the application designer, the programmer that is. No tool - a compiler or IDE - can accurately, with 100% certainty determine where in the call stack the given error must be caught and handled. The decision usually depends on the circumstances, such as the application's or use case requirements, etc. So, it is the programmer who has the responsibility to carefully consider all those factors. The error handling constructs must be implemented in the module that contains the sufficient contextual knowledge and ability to take the appropriate corrective action in response to the condition.

    One thing that can be stated with certainty is that in the vast majority of cases, the most appropriate place to handle an error is NOT immediately at the origin of the error. The immediate caller of the method that triggers an exception rarely can and should do anything about the error - simply because it does not have enough contextual knowledge of the circumstances in which it itself was called. So, in the majority of cases, the role of the exception should be to efficiently and transparently carry the error signal through the call stack to the one method that has the intelligence to handle that exceptional condition. 


Java Exceptions

Noble Idea

    In addition to the "traditional" approach to exception handling described in the previous section, Java has introduced a new type of exceptions - checked exceptions. Such exceptions, once thrown by a method, must be defined in the method signature (for unchecked exceptions that is optional), and the compiler forces any immediate caller of the method to take action. To satisfy the compiler, the immediate caller must either catch and handle the exception itself, or to add it to its own signature, which, in turn, subjects each of its own callers to the same restrictions.

    The noble idea behind checked exceptions was to introduce a harness mechanism that would remind/force developers to handle errors before they bubble up to the surface and break the application. This concept may be used with benefits in small tight sub-systems where it is, indeed, necessary to force the immediate caller to handle the error at its origin. However, as noted above, such cases, while valid and existent, make up only a small minority of all real-life situations. So, in a vast majority of cases, for proper handling, an exception absolutely must be propagated further up the call stack - to the only appropriate module that was designated to handle the exception. And this is where checked exceptions introduce more problems than they solve.

    Read the interview with Anders Hejlsberg (the Chief Architect of the C# project) where the subject of checked exceptions is discussed in great detail. In that interview given back in 2003, Hejlsberg explains the most obvious pitfalls of the Java's implementation of checked exceptions, and why the C# committee had unanimously rejected the idea. He specifically talks about such issues as broken encapsulation, inappropriately forced contracts, negative effect on scalability and "versionabiliy" of systems built with checked exceptions.

Myth: Checked Exceptions Add Safety

    In addition to the issues highlighted by Hejlsberg, perhaps, the most essential - and often completely overlooked - is the fact that, in practice, checked exceptions have failed their most hyped promise: increased safety. Actually, the myth of added safety is arguably the most outrageous and dangerous misconception promoted by the advocates of checked exceptions. In fact, if I doubted the sincerity and good intentions of the advocates of checked exceptions, I would call it an outright lie.

Whether the authors of checked exceptions meant this or not, relying on the compiler to indicate which methods can throw exceptions (e.g. fail to do what's expected from it) led so many programmers to disregard the fact that anything may fail under certain circumstances and worry only about a small subset of all potential errors represented by checked exceptions - leaving huge holes in their applications. (I am not theorizing here: it is merely a fact of life, a pathetic reality that can be observed on thousands of Java projects.)

    Many people - perhaps, even subconsciously - interpret the mere presence of checked exceptions in the language as a suggestion that some modules/methods may result in errors while others may not! So, they forget to handle anything other than what the compiler forces them to handle... Need I say more?

Myth: Recoverable vs. Unrecoverable

    Another very common and dangerously misleading guideline - widely promoted by various books, articles, and, most importantly, by Sun Microsystems in their tutorial - is that checked exceptions are meant to indicate "recoverable" conditions, while unchecked exceptions represent only errors (programming errors, i.e. bugs.) Some less experienced (or less competent) programmers even conclude from the latter that since runtime exceptions are "unrecoverable", applications should not even do anything about them! (I only hope those guys are not employed by companies that build software for systems where human lives are at stake!) The "Recoverable vs. Unrecoverable" thesis is also suggested in one of my most favorite and respected books on Java - Joshua Bloch's "Effective Java", which is quite understood considering that the book was written by one of the authors of the JDK. With all fairness, Bloch also stresses that checked exceptions should be used with caution and not overused. He suggests to exercise the best judgement when deciding what should be considered recoverable, and when in doubt, opt for an unchecked exception. And that is the approach I use. However, the only conclusion I usually arrive at is that I am in no position to decide for the client how they would want and need to act upon the specific exception. An API defines how a client accesses the unit of functionality, and what results it should expect. No API can dictate how the client should use the result after the result has been obtained. Therefore, only the caller can and should decide what it wants to recover from, and what it does not care to recover from - based on its own needs and not on the implementation detail of the module it calls. On the other hand, any exceptional condition must be handled by a correctly designed system - at some point. The question is where and when. So, it is also very, very wrong to suggest that runtime exceptions are meant to be ignored.

Myth: Unchecked Exceptions are Bad and Dangerous

    Inevitably, it has become typical to handle checked exceptions and completely disregard unchecked exceptions, only to act surprised when the application suddenly crashes. This, in turn, has generated a phobia of unchecked/run-time exceptions among inexperienced programmers. Those who mistakenly believe that run-time exceptions should not be handled, blame them for application crashes. As the result, a large number of Java programmers today does not even realize that it is their responsibility to design for error handling. A runtime exception is a very useful carrier of error information that is openly available for anyone in the call chain who is interested. Unfortunately, many developers show no interest at all in listening for what's out there for them to use. Instead, they all but thoughtlessly react to a compiler alert (often feeling annoyed) and do something just to make the compiler error go away. Needless to say, such approach cannot lead to reliable error handling.

It is absolutely incorrect to assume that all runtime exceptions should not be caught and allowed to propagate to the very "top" of the application. As I have stressed at the beginning of this article, for every exceptional condition that is required to be handled distinctly - by the system/business requirements - programmers must decide where to catch it and what to do once the condition is caught. This must be done strictly according to the actual needs of the application, not based on a compiler alert. All other errors must be allowed to freely propagate to the topmost handler where they would be logged and a graceful (perhaps, termination) action will be taken. 

Reality: Correctly Handling Checked Exceptions is a Lot of Work

    In large systems, a significant amount of cumbersome, repetitive boilerplate code is necessary to handle checked exceptions properly - to ensure that the error information indeed propagates to the dedicated handler instead of being mishandled. The try/catch constructs and throw clauses litter multiple methods and layers throughout the application. They unjustifiably pollute and bloat methods that have neither knowledge or ability to do anything about the errors they catch and re-throw.

To properly utilize checked exceptions and ensure their propagation to the correct designated handlers in a typical real-world application, a great deal of developer awareness and competence is absolutely required. When checked exceptions are used, even a slightest sloppiness of the programmer inevitably results in chaos, unimaginable code clutter, redundancies and unnecessary complexity. This is at odds with the original intent of checked exceptions, which is to simplify error handling and make it easier for developers of all levels of expertise to know what and when to handle.

    In reality, checked exceptions end up confusing programmers, prompting them to handle or swallow exceptions before they reach the proper handler, and ultimately introducing more problems than they solve.

Business Exceptions

    If there is one thing that both camps (checked vs. unchecked) seemingly agree on is that exceptions should be primarily used to indicate exceptional conditions and should not be used to implement business logic.  This guideline has been widely promoted by most publications, starting with the very popular and well-respected book by Joshua Bloch, "Effective Java".  In reality, however, both camps tend to use exceptions to signal business conditions and direct business logic based on that. Personally, I think that occasionally using an exception to cut through the layers and quickly trigger an alternative business flow is perfectly acceptable, and may be very convenient. However, this should be done with caution and after evaluating all other options.

   For example, here is a very typical use case. The client calls some User service to create a new user entry.  If a new user is successfully created, a new object of some sort is returned.  That may be a user ID, user entity, or some kind of result object that is appropriate according to the use case specification.  Naturally, one well-expected legitimate scenario of this use case would be a situation when a user with the provided credentials already exists.  Many API designers prefer to throw some sort of a UserAlreadyExistsException in such cases, and many argue that it should be a checked exception - in order to explicitly inform the caller of this potential outcome. Any other type of failure to create a new user would indicate a more serious system failure (or even programming error.)  

    Here's my decision-making process when designing APIs for use cases like the one described above. First, I identify any "legitimate" situations in which the service will not create a new user, e.g. "the user already exists", or, perhaps, "the max allowed participants exceeded/enrollment temporarily suspended", etc.  I know that all other conditions would be represented with a generic service-specific runtime exception, e.g. "UserServiceException" (extends RuntimeException) whose only purpose is to signal the service's failure that is not related to a legitimate business situation, and to provide the most complete error info possible for the diagnostic purposes.  Then, I look at the other - legitimate - cases of failure to create a new user and see if my service might have more information than just the binary "yes/no" to return to the client.  If the service, upon its failure to create the user with the given data, has a whole lot of meaningful information, then it may be a good idea to have the API return a result object that the client would examine to see whether a user was created or whether the status of the returned object indicates one of the multiple meaningful conditions.  In our example, "the user exists" condition may only mean one thing: an entry with the given primary key already exists.  So, there is hardly anything else useful that can be returned to the client, and, perhaps throwing an exception is just fine. (Another option might be returning a null vs. a valid newly created object. But the latter may or may not be acceptable depending on the API specifications or specific development guidelines. If the null result may be ambiguous, such solution would not be good. Also, many guidelines discourage returning null for the sake of programming safety.) The bottom line is that throwing an exception in a case of a business condition is only one of several options, and, perhaps, more often than not, it is not the most preferred way. In case the analysis indicates that throwing a business exception is indeed the most convenient solution (and it may be), I would create a separate UserExistsException (that extends my generic  UserServiceException), document it in the API's Javadoc, and place it into the method signature. Since it is a runtime exception, it will not force the very immediate caller method on the client to mess with it but will give the client programmer the freedom to decide where to place the try/catch construct for that particular exception without polluting any methods on the call stack in between.  The client can simply implement a listener for this particular type of exception and not worry about it being a nuisance anywhere else. As I have said, many developers would prefer to make such business exception a checked exception, which is also a valid option, however I would vote for less burden on the client and providing more information via the method signature and Javadoc.  These days, most competent developers would catch a checked business exception on the client the first thing - as soon as they call the API that enforces the contract - and convert it into a runtime exception so it can freely propagate to the dedicated listener elsewhere on the client.

    It is reasonable to expect (and require) of each developer to understand the nature of any API they use. So, if the developer designs an application that has a need to implement the workflows for a successful user creation as well as for the situation when a user already exists, it is a full responsibility of that developer to understand how to use the "createUser" API.

Avoid Using Exceptions to Communicate Business Data

     If an exception does not represent a legitimate business condition, one should never put meaningful business data into the instance of the exception class! Generally, the information inside the exception is for diagnostics purposes only. In most cases, the only "business" information carried by an exception is the very fact of that particular exception being thrown. If you want to indicate a different type of condition, use a different exception class.

    It may be somewhat appropriate in cases of business exceptions to provide a public method or two on the exception class that would allow the clients/handlers to extract some supplementary data. Programmers should use their best judgement to decide what kind of data could be conveyed to the client via an exception, and whether it is appropriate to use that data in the client's business logic. Personally, I believe that in most cases it is not very appropriate or fair to expect the client to build its business logic based on some potential attribute values stored in the instance of the thrown exception class. Exception classes should not contain any "error codes" that are required to be correctly resolved by the client's business logic via all sorts of ugly conditional constructs. This, to a large extent, defeats the purpose of exceptions that is, among other things, to eliminate the necessity in such conditional logic. Unfortunately, this ridiculous pattern of exception misuse is widely spread, and it is even promoted by code examples in many Java books and articles! The worst possible example of bad usage of exception data is when clients are expected to parse the detail message and act upon whatever string tokens are detected.

    Exceptions must not carry any user-oriented messages inside them. That's another common form of exception abuse.  The "detail message" obtained via the Exception.getMessage() method is not meant for the presentation purposes. It is for diagnostics. If you need to present a user-friendly message in the UI in a case of a particular exception, the appropriate message must be resolved on the client side as part of the handling of that particular exception class - by the dedicated handler. It is your presentation tier - and only the presentation tier - that should be aware of the locales and message resources. Therefore, it is absolutely incompetent to place your presentation-tier-specific messages or message constants into the exception classes, especially those originated in other tiers.

    Any mappings between business exceptions and appropriate client actions or presentation details must be implemented on the client itself in a separate dedicated module. For a good example of this concept, I encourage you to take a close look at the Spring MVC's SimpleMappingExceptionResolver class and its usage.

Use Unchecked/Runtime Exceptions

NOTE: You should carefully read the previous chapters in this article before reading this!

    Developers of the modules that have no business of handling a particular error as it passes through do not need to be aware of that error at all. They do not need to bother with implementing any code to handle that error. Instead, only one module in the whole application must be implemented to listen to that particular exception, catch it when it bubbles up, and handle it according to the application requirements. Combined with the mandatory common-sense understanding that anything might potentially fail, runtime exceptions become a super-efficient tool that allows to consolidate error handling in a few strategically placed modules, and with bullet-proof safety.

    Like any technology or tool, unchecked exceptions may be mishandled, true. However, it is very easy to spot - during testing - a case when a particular run-time exception bubbles up way too close to the surface (or crashes the application!) instead of being handled earlier. A detection of such case indicates that a new handler must be introduced at a deeper level (as needed by the system) that will intercept that particular error before the more generic handler. No modifications to any business code would be required. It is significantly easier and less time-consuming than chasing a usually obscure problem caused by a mishandled checked exception.

    While there are cases where a checked exception does no harm and serves good purpose, for the sake of consistency and to avoid error-handling chaos, I recommend using only unchecked exceptions. Unfortunately, the particular implementation of the checked exceptions concept in Java introduces so many possibilities and encouragements for doing things wrong that any benefits of exception-re-enforced contracts between methods (where they may help) pale compared to the overall damage this mechanism causes to the software projects all over the world, and, most importantly, to the very perception of the error handling task and the role of a programmer in it.

    Checked JDK and 3rd-party exceptions could always be caught inside the implementations of the your classes, wrapped into module-specific run-time exceptions (usually, with an additional clarifying message), and re-thrown only to be caught and appropriately handled by the designated handler upstream.

Packaging Exception Classes

    Each exception class should normally reside inside the package with the classes that implement the exact functionality the exception relates to. It is a common mistake and bad practice to use a dedicated exceptions package that combines all exception classes for an application or component. Such approach violates modularity by forcing unnecessary package couplings: all other packages in the application become conjoint at the common exceptions package.

Summary

    I think that one very important but simple point is missing from most discussions on this subject: anything may fail. Period. I just can't stress this enough. If we insist that something in the method signature must remind the programmer about it, then every method's signature must be equipped with such indication, e.g. "throws Exception". The fact that some methods in Java are declared throwing an exception (checked) and some are not is seriously misleading since it implies that some methods may never result in errors, or if they do, the programmers may not need to worry about it. And so many of them don't.

When an error occurs within a function - regardless of whether the function declares a checked exception or not - that should never be a surprise to a developer! Just know that it may happen, and stop looking at the compiler to tell you whether it may or may not. It may. Period. End of story.


    Programmers must never rely on the source of the potential error to tell them whether they should expect an error or not!!! The necessity for error handling must be unconditionally assumed by all programmers at any time. What this means is that the programmer must always KNOW and REMEMBER that any module or operation may potentially - in certain conditions - fail or misfire. Once this simple fact is understood, and the application programmer knows that the module her code depends on may fail, all she has to decide is WHERE in the application (not WHETHER!) she should handle that potential class of errors - without having to do something about the error immediately at its origin. This is what we should be teaching programmers! Instead, we are telling them: "Hey, we have this cool feature in Java that will tell you precisely when you need to worry about errors; in case a method signature declares an exception, you should pay attention..." Are we kidding ourselves? Does anyone see how horribly destructive - and foolish - such approach is? It is one of the most harmful but well-concealed philosophies that has ever been burnt into the minds of millions of programmers. I can't tell you how many times I have heard a developer scream with rage at his application exploding in his face: "What the ...!!! I have taken care of all the exceptions and it still blows up!" What they are saying is that they had caught all the checked exceptions in their code, but didn't even think about any other potential failures. Checked exceptions re-enforce this horrific mindset. Instead of consciously expecting negative results as a given possible outcome of any operation and thoughtfully designing against them, we wait for the compiler to tell us whether we should worry about a possible error or not! That is just ridiculous. The implication of checked exceptions (not intentional, of course) is that if the method does not throw a checked exception, there's nothing to worry about. We know that it is not true because checked exceptions are just a subset of all possible exceptions that may be thrown by the code that implements the operation. And if we know that, we understand that even if we handle the checked exceptions, at some point we must take care of everything else. And if so, we don't need checked exceptions. The problem is that many today's Java programmers raised on checked exceptions are not used to think that way. As the result, they often forget to handle the rest of the exceptions. And you can't really blame them, because - with checked exceptions - the error handling code is everywhere, literally in every single method, i.e. in thousands of places. It is virtually impossible to keep track of what and where is being handled. That is the reason why so many errors are not handled at all, countless errors are mishandled, and so much of valuable error information is lost completely before it reaches the proper place in the system where sense could be made of it, where it could and should be properly handled.


    Fortunately, it's easy to build bullet-proof systems with only unchecked exceptions (much easier than ensuring proper handling of checked ones.) Any application, in reality, only needs to distinguish between a very limited set of exception classes that need to be handled separately by strategically placed dedicated handlers to which these classes of exceptions should propagate without being intercepted. Everything else should be allowed to bubble up to the "ultimate" top-level (the "toppest") handler that catches everything that was not handled before and handles it gracefully. (This means, once such [unchecked] exception is thrown, you don't have to worry about it anywhere in your code, knowing that it will be caught and handled by your top-level handler.) So, you normally start with creating such top-level handler that will catch everything inside the system.

    For example, in a Spring MVC application that would be an exception resolver class configured to catch and handle the most generic Exception class, and, perhaps, resolving the error to the generic error page that politely apologizes to the user and informs the user that the system is currently not able to process the request, etc. Of course, in a case of such web application, there must also be a filter defined in the web.xml file that would ensure that any critical exceptions that may potentially occur in JSP tags on the pages would also be caught and the application redirected to an error page. Such filter is required because any JSP tag exceptions occur outside the Spring-managed controller and will not be dispatched to the resolver registered in the Spring application context.

    Next, we need to identify the classes of conditions that must to be handled individually in a distinct manner. We decide where such events should be handled in our system - based on our system's requirements, and nothing else! For example, in Spring MVC web applications, often a single generic exception resolver class may be all you need to handle all your server-side exceptions! Such exception resolver would resolve all distinct exception classes by implementing the mappings between the exception classes and views (or, in the case of Spring WebFlow, flow states.) Based on the class of the exception thrown anywhere within the application, it will redirect the request to the appropriate view or state - possibly, after executing some logic, if necessary. The resolver would always fall back on the generic "catch-all" redirection, in case the caught exception class is not explicitly mapped to any view or state.

    If, during testing, we discover that some exception propagates too far (e.g. to the topmost handler) instead of being handled earlier, well, we immediately know that another lower-level handler is required (or that the particular exception should be caught by one of the existing lower-level handlers.) That's it. Very simple. And very reliable.