January 14th, 2016
Mesos is a top-level Apache project that aims to be the central resource manager of a datacenter. This is a lofty goal, and for those of us who have more experience with YARN, another popular resource manager, and who come from a more “big data”-centric background, the use case for Mesos might not be readily apparent. The difference lies in how one views the datacenter.
More than MapReduce
Mesos is described on its homepage (http://mesos.apache.org) as a ‘distributed systems kernel’, with Hadoop and Spark only a small footnote in the description. Unlike YARN, which is tightly integrated into the Hadoop ecosystem, Mesos supports many different ‘frameworks’: essentially, software built to use the resource manager. As Hadoop matures and more big data technologies come to the forefront (Spark, Flink, Velox), our requirements for a datacenter are evolving from a place that simply runs MapReduce jobs to an infrastructure able to handle the varied needs of its users. These requirements usually fit better with Mesos, which surpasses what YARN can offer, both in the types of technology it supports and in the resource scheduling it provides.
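To make the idea of a ‘framework’ a little more concrete, here is a minimal sketch of the resource-offer pattern that Mesos is built around: the resource manager offers free resources to frameworks, and each framework decides whether to accept or decline. This is a toy model in plain Python and not the Mesos API; the names Offer, Framework, consider, and run_offer_round are all hypothetical, and the sketch assumes, for simplicity, that an accepted offer is consumed whole.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Free resources on one node (hypothetical toy type, not a Mesos class)."""
    node: str
    cpus: float
    mem_mb: int


@dataclass
class Framework:
    """Toy stand-in for a framework's scheduler (e.g. Spark or Hadoop)."""
    name: str
    task_cpus: float
    task_mem_mb: int
    pending_tasks: int
    launched: list = field(default_factory=list)

    def consider(self, offer: Offer) -> bool:
        """Accept the offer if a pending task fits in it, otherwise decline."""
        fits = offer.cpus >= self.task_cpus and offer.mem_mb >= self.task_mem_mb
        if self.pending_tasks > 0 and fits:
            self.pending_tasks -= 1
            self.launched.append(offer.node)
            return True
        return False


def run_offer_round(offers, frameworks):
    """Offer each node to the frameworks in order until one accepts it.

    Assumption: an accepted offer is used up entirely; leftover resources
    are not re-offered in this simplified model.
    """
    for offer in offers:
        for fw in frameworks:
            if fw.consider(offer):
                break


offers = [Offer("node-1", 4.0, 8192), Offer("node-2", 1.0, 1024)]
spark = Framework("spark", task_cpus=2.0, task_mem_mb=4096, pending_tasks=2)
web = Framework("web", task_cpus=0.5, task_mem_mb=512, pending_tasks=1)

run_offer_round(offers, [spark, web])
print(spark.launched, web.launched)  # ['node-1'] ['node-2']
```

The point of the sketch is that the resource manager never needs to know what a task is; each framework applies its own placement logic to whatever it is offered, which is what lets very different workloads share one cluster.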
One recent project that has raised the profile of Mesos among its users is Myriad (https://github.com/apache/incubator-myriad), an incubating project that aims to have YARN and Mesos work together as cluster managers. The caveat here is that a cluster can really only have one cluster manager, so in practice Mesos scales a YARN cluster. The advantage of this arrangement is that the stack of technologies Mesos supports is a superset of the ones YARN supports, so we get the best of both worlds.
A question one might ask is, ‘Is transitioning my cluster to Mesos worth it?’ Even disregarding the frameworks available, Mesos has a host of features that make it more appealing than YARN: better resource scheduling algorithms, more fine-grained control over resources, and the ability to isolate containers. At this point, however, I can’t say that Myriad is production-ready, based on my experience with it. Documentation on the official page is sparse, there aren’t many questions on Stack Overflow about issues with it, and some of the errors I faced were indecipherable. Mesos is definitely something to keep an eye on for the future, but YARN integration via Myriad currently leaves a lot to be desired.