When Charles Darwin took his historic voyage aboard the HMS Beagle from 1831 to 1836, "big data" was measured in pages. On his travels, the young naturalist produced at least 20 field notebooks, zoological and geological diaries, a catalogue of the thousands of specimens he brought back and a personal journal that would later be turned into The Voyage of the Beagle. But it took more than two decades for Darwin to process all of that information into his theory of natural selection and the publication of On the Origin of Species.
While biological data may have since transitioned from analog pages to digital bits, extracting knowledge from data has only become more difficult as datasets have grown larger and larger. To wedge open this bottleneck, the University of Chicago Biological Sciences Division and the Computation Institute launched their very own Beagle -- a 150-teraflop Cray XE6 supercomputer that ranks among the most powerful machines dedicated to biomedical research. Since the Beagle's debut in 2010, over 300 researchers from across the University have run more than 80 projects on the system, yielding over 30 publications.