Building systems using microservices requires us to think more deeply about failure isolation and testing. *TLA+ *is a formal specification language that can be useful in both these scenarios. For failure isolation, TLA+ can be used to identify invariants in your system that can be monitored directly. An invariant can be the ratio of number of requests to one service to the number of requests to a second service, for example. Any change in this ratio would lead to an alert. TLA+ is also being used to identify subtle design flaws in distributed systems. Amazon, for example, used model-checking based on a formal specification written in TLA+ to identify subtle bugs in Dynamo DB before it was released to the public. For most systems, the investment required to create the formal specification and then perform model checking is probably too great; however, for critical systems - complex ones, or those with many users - we think it’s very valuable to have another tool in our toolbox.