By Derek Tsang, originally for the Knowledge Lab website.
Knowledge Lab awarded 1.38 million dollars this week to support 15 data-driven projects about how ideas are formed and circulated. Supported by the John Templeton Foundation, the grants span 11 institutions, welcome eight new researchers to Knowledge Lab’s Metaknowledge Research Network, and support the work of eight existing members.
Over the course of the next year, researchers will pursue projects spanning everything from computer science to literature, plasma physics, and sociology. What the projects have in common, though, is that they all seek to understand the nature of their respective fields through computational methods.
“Knowledge Lab is about understanding where questions and answers come from, in science and elsewhere,” said James Evans, the lab’s director and a professor at the University of Chicago. “And when you understand that deeply enough, those biases, those heuristics, those rules, when you take those into account, you can generate better hypotheses and technology.”
Evans and Knowledge Lab executive director Eamon Duede sent out a call for proposals months ago through the lab’s network members and various domain-specific email lists. They received some 50 proposals before settling on the 15 projects they have committed to fund.
The lab evaluated proposals based on their relevance to the goals of the network, their intellectual novelty and plausibility, their potential broader impacts, and the promise of the team, said Duede. “We wanted really creative people in their fields to think creatively in those ways, about our field.”
These grants represent a major step towards Knowledge Lab’s goal of establishing metaknowledge as a new field. “We want to grow this field, so what we get out of the grants is new collaborators, a bigger network, and more people asking and answering these questions,” said Duede.
In addition to existing researchers at the University of Chicago, the University of Washington, the University of Wisconsin, Northwestern, UCLA, and Columbia, the grants bring researchers from Notre Dame, Carnegie Mellon, Cornell, MIT, Oregon Health and Science University, and the Rehabilitation Institute of Chicago into the Metaknowledge Research Network.
In a year’s time, Knowledge Lab hopes to see at least one paper published from each of its granted projects in addition to the online platforms and other materials several projects have proposed.
Hod Lipson, Igor Labutov: "Automatic curriculum generation from prerequisite concept networks". Cornell
Lipson is the director of the Creative Machines Lab at Cornell, which researches machines that can build other machines; Labutov is one of Lipson’s PhD students. Their project aims to solve what they call the “bottleneck” of recent technical and academic content that has no textbooks or courses available as introductions. Lipson and Labutov will build on their existing work to create an algorithm that automatically identifies gaps in knowledge transfer and proposes information to fill them, drawing on a network of prerequisite concepts. “This network will map which information forms conceptual ‘prerequisites’ for understanding other information,” they write.
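As a rough illustration of the prerequisite idea (not Lipson and Labutov’s algorithm, which the announcement does not describe), a curriculum can be ordered so that every concept appears after the concepts it depends on. The sketch below assumes a small, made-up prerequisite map and uses Python’s standard `graphlib` module; the concept names and the helper functions `curriculum_order` and `missing_prerequisites` are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite map: each concept lists the concepts a
# reader should understand first. The entries are illustrative only.
prerequisites = {
    "probability": set(),
    "linear algebra": set(),
    "statistics": {"probability"},
    "machine learning": {"statistics", "linear algebra"},
}


def curriculum_order(prereqs):
    """Return the concepts in an order where prerequisites come first."""
    return list(TopologicalSorter(prereqs).static_order())


def missing_prerequisites(prereqs, covered):
    """Return concepts that something depends on but `covered` lacks."""
    required = set().union(*prereqs.values()) if prereqs else set()
    return required - set(covered)


order = curriculum_order(prerequisites)
print(order)
print(missing_prerequisites(prerequisites, ["probability"]))
```

A topological sort is only one simple way to turn a prerequisite network into a reading order; it says nothing about how the network itself would be inferred from text.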
Lynne G. Zucker, Michael R. Darby: "Coevolution of Organizations and Ideas: Knowledge Creation and Transmission". UCLA
Zucker and Darby are UCLA professors -- Zucker in sociology and Darby at the Luskin School of Public Affairs and the Anderson School of Management. Their proposal is a bold new direction for their previous work tracking the birth of new ideas in organizations, and how these ideas spread differently in different types of organizations. Now, they hope to track which factors increase new ideas, with a particular focus on projects that bridge commercial and academic boundaries. “Is creativity contagious?” they ask. “Does it matter for overall creativity whether this process starts at the top or the bottom of the organizational hierarchy?”
Edward Boyden, Adam H. Marblestone: "Beagle: A tool that empowers individuals and teams to organize and share scientific insights". MIT
At the MIT Media Lab, Professor Boyden runs the Synthetic Neurobiology research group, where he and Marblestone research (among other things) technology platforms, research strategies, and scientific collaborations. Boyden and Marblestone plan to create a “sharable meta-layer on top of the existing bodies of scientific knowledge” to make explicit the implicit insights that come with scientific literature: what’s interesting, what deserves scrutiny, what’s conventional, and what biases are coming into play. This meta-layer, named Beagle after Charles Darwin’s ship, will form the basis for other extensions that can annotate research and mine existing annotations for metaknowledge.
Rebecca Steorts: "Computationally Scalable Statistical Methods for High-Dimensional Record Linkage". CMU
Steorts is a visiting professor of statistics at Carnegie Mellon University, where she researches methods for recovering underlying structures from degraded data sets in the social sciences. Her project with Knowledge Lab will use Bayesian models to cluster records, linking those that refer to the same entity and resolving duplicated information. By adding innovative blocking methods that limit the search space of her algorithms, Steorts anticipates that her work will scale particularly well to large and complex data sets. Steorts plans to make her research available as an open-source software package for researchers in a variety of fields.
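To make the notion of blocking concrete, here is a minimal sketch of the general technique, not Steorts’s Bayesian method: records are grouped by a cheap key, and only records in the same group are compared. The sample records and the blocking key (first letter of surname plus birth year) are assumptions chosen purely for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records (id, surname, birth year); illustrative only.
records = [
    (1, "Smith", 1980),
    (2, "Smyth", 1980),
    (3, "Jones", 1975),
    (4, "Smith", 1980),
    (5, "Jonas", 1975),
]


def block_key(record):
    """Assumed blocking key: first letter of surname plus birth year."""
    _, surname, year = record
    return (surname[0].upper(), year)


def candidate_pairs(recs):
    """Yield only pairs of record ids that share a blocking key."""
    blocks = defaultdict(list)
    for rec in recs:
        blocks[block_key(rec)].append(rec[0])
    for ids in blocks.values():
        yield from combinations(ids, 2)


# Without blocking, every record is compared with every other record.
all_pairs = list(combinations([r[0] for r in records], 2))
# With blocking, only records in the same block are compared.
blocked_pairs = list(candidate_pairs(records))
print(len(all_pairs), len(blocked_pairs))
```

In this toy example blocking cuts the comparisons from 10 pairs to 4. The trade-off is that a poorly chosen key can place true duplicates in different blocks, so they are never compared.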
Tim Weninger: "Knowledge Hierarchies and Natural Navigation". Notre Dame
After receiving his PhD in computer science from the University of Illinois Urbana-Champaign, Weninger joined the faculty of the Notre Dame College of Engineering in 2013. Weninger researches information networks using machine learning, databases, and information retrieval. Historically, research on hierarchies has focused on human hierarchies like military chains of command and management trees at corporations. Weninger plans to research hierarchies of knowledge instead of individuals by using his existing algorithm for analyzing information networks like Wikipedia. Then, by using data about how users traverse such networks, he hopes to learn how humans “navigate and perceive information in hierarchies” to create a model “capable of predicting the development or conceptualization of future knowledge.”
Cesar Hidalgo: "Mapping the Social and Technological Context of Cultural Production". MIT
Hidalgo’s work in the Macro Connections group of the MIT Media Lab focuses on refining concepts of complexity and evolution to help design industrial policies that encourage citizens’ well-being. His granted project is an attempt to understand the context of knowledge production through a dataset generated using biographical stories from different cultures and time periods. Hidalgo will use this dataset to explore how the introduction of broadcasting technologies like the printing press and radio has changed the types of knowledge being produced, and how diversity plays into that process. He plans to publicize his results using an online platform to visualize his dataset.
Michael Alfaro, Christopher Kelty, Erik Gjesfjeld: "Mode and Tempo in Technological Evolution: innovation, extinction, and the dynamics of technological diversification". UCLA
Professors Alfaro and Kelty and their post-doc Gjesfjeld work at UCLA’s Institute for Society and Genetics, which works to understand the ethical and legal impacts of genetic and genomic research. Alfaro, Kelty, and Gjesfjeld’s granted project aims to answer questions about innovation and extinction. “What happens when technologies do not succeed?” they ask. “Do they simply disappear, or do these failures play a role in the possibility for innovation in other domains?” Borrowing from evolutionary biology, they plan to create a dataset testable by the toolkit biologists use to study evolution above the level of individual species. They will also co-teach an undergraduate course about technological evolution, and intend to employ some of those students to help create their dataset over the summer of 2015.
Stephen David: "Neurotree: Graphing the Evolution of Science Through Mentorship Networks". Oregon Health and Science University
Neuroscience professor Stephen David runs Neurotree, an open-access website that has tracked mentor relationships for over 40,000 neuroscientists over the last eight years. David plans to use his grant to develop tools to curate Neurotree’s database and link it to publication databases. These links will help “explore how mentorship influences the emergence and evolution of ideas, and if this information can help trainees choose mentors,” writes David. The grant will also support the development of the growing Academic Family Tree, which does work similar to Neurotree for other disciplines like music composition and theology.
Christopher Lee, Mihaela van der Schaar, Erin Sanders: "Statistical Thinking: a scalable, online platform for identifying working scientists’ misconceptions and causal factors via concept testing, data mining and randomized trials". UCLA
Professors Lee, van der Schaar, and Sanders are coming together from three distinct fields: biochemistry, engineering, and microbiology. They’ve created an online platform to distribute Open Response Concept Testing (ORCT), which “measures the specific misconceptions that cause problem-solving errors,” they write. Using their existing relationships, they plan to engage scientists with ORCT as part of online training in biomedical big data analytics. By seeing how scientists engage with this content, they hope to evaluate whether embracing big data is realistic for all scientists, how much it would cost in terms of training, and what factors play into scientists’ misconceptions when attempting to learn about big data.
Richard Jean So, Hoyt Long: "Culture and the Production of Knowledge". University of Chicago
So and Long are humanities professors at the University of Chicago -- So in English, and Long in Japanese Literature. Their existing projects in the Metaknowledge Research Network use computational methods to look at the spread of specific artistic styles and forms across different cultures’ literature. The pair plan to use their Knowledge Lab grant to explore how cultural texts like literature and journalism disseminate scientific knowledge, and how those same texts change how science is practiced. So and Long’s project consists of three parts: a corpus of cultural texts from 1880 to the present; a suite of natural language processing and topic modeling tools to parse this corpus; and a volume of essays distilling what they learn from their analysis.
David Blei, James Evans: "Estimating Multidimensional Influence in Science and Scholarship". Columbia and University of Chicago
Evans, the director of Knowledge Lab, and Blei, professor of statistics and computer science at Columbia and a Knowledge Lab board member, want to reexamine Knowledge Lab’s data. They propose to develop statistical models to answer three questions: How do scientists’ interests change over the course of their careers? Which papers have had the most historical influence, as measured by both language and equations? And which papers are interdisciplinary, and how do they become so? These models, they claim, would have predictive power for modern papers, and would be directly applicable to other forms of public reading, like journalism, social media, and forms of entertainment.
Jevin West, Carl Bergstrom: "Inferring the hierarchical structure of citation networks to improve semantic search of the scholarly literature". University of Washington
Bergstrom, a professor of biology at the University of Washington, was one of the first members of the Metaknowledge Research Network for his research on the economics of scholarly publication. With this grant he joins West, a professor at UW’s Information School. West and Bergstrom’s research uses citation networks to make available the type of knowledge typically only available to domain experts -- which studies are foundational, the jargon necessary for study retrieval, and how core concepts in a field relate. Using citation networks alone, they’ve already produced a powerful recommendation engine for scholars, but they intend to expand on their existing work by adding text-based search.
Konrad Kording, Luis Amaral, James Evans: "Optimizing scientific reviewer assignments". Rehabilitation Institute of Chicago, Northwestern, and University of Chicago
Kording, Research Scientist / CI chair at the Rehabilitation Institute of Chicago, joins Knowledge Lab board members Amaral and Evans to take on the problem of peer review, which is at the center of how science is formalized and propagated. “The match between reviewers and manuscripts, however, has rarely been analyzed and has never been optimized to minimize bias and variance,” they write. Kording, Amaral, and Evans have been given exclusive access to a dataset of manuscripts, reviewers, and editor ratings for more than 8,000 neuroscience papers submitted to a major journal. Their project will analyze this dataset to quantify bias and variance, and propose ways to optimize the peer review process.
Jacob G. Foster, Carl Bergstrom: "The Theory of Games and Scientific Behavior". UCLA and University of Washington
Foster, a Knowledge Lab board member and a professor of sociology at UCLA, joins Bergstrom in an attempt to develop a game-theoretical model of the social organization of science. Game theory uses mathematics to study decision making by quantifying incentives for and against certain lines of action, and Foster and Bergstrom want to use such a model to understand why individual scientists take the projects they do, and why certain institutions persist instead of others. Their project will help answer “how the structure of scientific institutions and the conventions for scholarly communication influence the type of problems addressed,” they write.
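For readers unfamiliar with game theory, the sketch below shows the basic move of quantifying incentives. It is a toy example, not Foster and Bergstrom’s model: two hypothetical scientists each choose a “safe” or “risky” project, the payoff numbers are invented, and the function `pure_nash_equilibria` (a name assumed here) finds the choices from which neither scientist would gain by switching alone.

```python
from itertools import product

STRATEGIES = ("safe", "risky")

# Invented payoffs (scientist A, scientist B) for each pair of choices.
# The numbers are assumptions used only to illustrate the method.
PAYOFFS = {
    ("safe", "safe"): (2, 2),
    ("safe", "risky"): (3, 4),
    ("risky", "safe"): (4, 3),
    ("risky", "risky"): (1, 1),
}


def pure_nash_equilibria(payoffs, strategies):
    """Return strategy pairs where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(strategies, repeat=2):
        pay_a, pay_b = payoffs[(a, b)]
        # A cannot do better by switching while B's choice stays fixed.
        a_is_best = all(payoffs[(alt, b)][0] <= pay_a for alt in strategies)
        # B cannot do better by switching while A's choice stays fixed.
        b_is_best = all(payoffs[(a, alt)][1] <= pay_b for alt in strategies)
        if a_is_best and b_is_best:
            equilibria.append((a, b))
    return equilibria


equilibria = pure_nash_equilibria(PAYOFFS, STRATEGIES)
print(equilibria)
```

With these invented payoffs, the stable outcomes are the two in which the scientists make different choices, a small example of how incentives alone can push researchers toward different kinds of projects.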
Elizabeth Pontikes, Gary Lupyan, James Evans: "Uncovering Priors in Academic Fields". University of Chicago and University of Wisconsin
Pontikes, a professor at the Booth School of Business, researches how organizations define themselves within markets. Lupyan, a psychology professor at the University of Wisconsin and an existing member of the Metaknowledge Research Network, studies what makes certain explanations more compelling, and how ideas spread across populations. Their granted project seeks the common factors that lead to fractures between schools of thought, using citation networks and surveys to identify the different psychologies of researchers within a field. How much of the difference between schools of thought can be chalked up to cognitive biases, and how much to field-specific content?