When organizations choose a vanilla Hadoop or Spark distribution instead of one of the vendor distributions, they have to decide how they want to provision and manage the cluster. Occasionally, we see "handcranking" of Hadoop clusters using config management tools such as Ansible, Chef and others. Although these tools are great at provisioning immutable infrastructure components, they're not very useful when you have to manage stateful systems and can often lead to significant effort trying to manage and evolve clusters using these tools. We instead recommend using tools such as Ambari to provision and manage your stateful Hadoop or Spark clusters.