Knowledge manifold: A universal model of collective knowledge production

Literature Review: from network to manifold

The past decade has witnessed the growth of network science, which has dramatically changed the landscape of human knowledge (Barabási, 2009). As a general model of data representation, networks have been studied in almost every field, including ecology, chemistry, engineering, and the social sciences, to name just a few. In particular, the combination of complex network studies and the social sciences was an important driving force behind the rise of computational social science (Lazer et al., 2009). And when network models met bibliometrics, they provided insights into the science of science (Evans 2010; Wang 2013).

There are two traditions in network modeling. One assumes a non-trivial geometric structure as the background for trivial, local links among uniformly distributed nodes, whereas the other assumes complex linking dynamics on a very trivial geometric background, or even without a geometric background. The following table sorts ten important network models in chronological order, specifying the tradition they are following. The symbols "✗" and "✓" are shown wherever the corresponding ingredient is missing/trivial or non-trivial, respectively.

Model | Pattern | Geometric background | Linking dynamics | System
West et al., 1997 | densification*1 | | | living organisms
Watts & Strogatz, 1998 | small-world*2 | | ✓*3 | social network
Banavar et al., 1999 | densification | | | transportation networks
Albert & Barabasi, 1999 | scale-free*4 | | | social, technological, and biological networks
Dreyer, 2001 | densification | | | living organisms
Song et al., 2010 | scale-free and densification | | | human mobility
Papadopoulos et al., 2012 | small-world, scale-free, and densification | | | social, technological, and biological networks
Bettencourt 2013 | densification | | | urban cities
Zhang et al., 2015 | small-world and densification | | | online communities
Zhao et al., 2015 | scale-free and densification | | | online and offline human mobility

Annotations:
*1. Densification: the super-linear scaling of the number of edges with the number of nodes. Also described as allometric scaling or accelerating growth.
*2. Small-world: the coexistence of a short diameter and a high clustering coefficient.
*3. The rewiring mechanism seems to be trivial, but there is a non-trivial requirement on the rewiring probability.
*4. Scale-free: the power-law distribution of node degree.
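The densification pattern in annotation 1 can be made concrete with a small sketch. Assuming the scaling takes the form E = c * N^gamma, the exponent gamma is the slope of log E against log N, and densification corresponds to gamma > 1. The function name `scaling_exponent` and the synthetic node and edge counts below are illustrative assumptions, not data from any of the cited studies.

```python
import math

def scaling_exponent(nodes, edges):
    """Least-squares slope of log(edges) against log(nodes)."""
    xs = [math.log(n) for n in nodes]
    ys = [math.log(e) for e in edges]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic snapshots of a growing network (hypothetical numbers),
# generated from E = 2 * N^1.2 so the true exponent is known.
node_counts = [100, 1000, 10000]
edge_counts = [2 * n ** 1.2 for n in node_counts]

gamma = scaling_exponent(node_counts, edge_counts)
print(f"estimated exponent: {gamma:.3f}")  # a value above 1 indicates densification
```

On real data the points would not fall exactly on a line, so the fitted slope should be read as an estimate rather than a measured constant.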

The above table shows an interesting fact: despite the enormous popularity of the preferential attachment model (Albert & Barabasi, 1999), which did not consider the geometric background of networks, many other models, especially those that tried to explain the "small-world" and densification properties of networks, relied on the assumption of an underlying geometric structure.

This is not surprising: the local clustering structure of networks ("small-world") implies the existence of hidden geometric constraints on linking dynamics. And the heterogeneity of node and link density ("scale-free" and densification), which seems to be the consequence of complex linking dynamics (Albert and Barabasi, 1999; Zhao et al., 2015), can also be understood as a feature of an underlying, nontrivial geometric structure. This underlying geometry can be the "fractal" space-time structure that shapes the growth of living organisms (West et al., 1999) and urban cities (Bettencourt, 2013), or the "hyperbolic" latent metric space that constrains the Internet and social connections (Papadopoulos et al., 2012).

Inspired by the success of the geometric tradition of network studies, we propose to model the production of human knowledge, such as the growth of citation networks and open-source software dependency networks, by a random geometric graph on a hyperbolic manifold (a topological space that resembles Euclidean space locally but has constant negative curvature). We believe the proposed research direction is promising because of several observations, listed as follows:

  1. The benefits of a geometric interpretation of dynamics are well acknowledged in astrophysics, which is still benefiting from the prediction of gravitational waves given by the Lorentzian manifold model of spacetime proposed by Einstein, who greatly generalized Newtonian dynamics;

  2. Hyperbolic manifold modeling not only explains the emergence of complex dynamics such as preferential attachment in citation networks, but also works as a machine learning model for data fitting and prediction.

In particular, we can use it to identify the fields of papers and hence investigate the rise and decay of fields, and to predict future citations between papers and thus detect influential studies and scholars. As suggested by a recent study (Thomas et al., 2016), manifold learning methods, including multidimensional scaling (Torgerson, 1952) and Isomap (Tenenbaum et al., 2000), are closely related to the hyperbolic model of networks (Papadopoulos et al., 2012). This finding further supports our confidence in a hyperbolic manifold model of knowledge systems that is both theoretically interesting and of practical importance.

  3. Manifold modeling may provide insight into information diffusion, which has been a hot topic for decades, yet still lacks an explanation for phenomenological models such as the logistic diffusion curve (Rogers, 1962) or the Bass model (Bass, 1969).

Recent studies suggested that once the hidden geometry of human mobility is recovered from air-transportation data, the global spread of diseases such as SARS or H1N1, which is complex in the conventional view, becomes a simple wave-like propagation (Brockmann and Helbing, 2013). This reminds us that the diffusion of information, which is often compared to virus diffusion, may also benefit from the construction of a hidden geometric structure. And understanding the mechanism of information diffusion is of critical importance to understanding the production of knowledge.
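As a rough illustration of how a hidden geometry can be recovered from flow data, the sketch below follows the effective-distance idea of Brockmann and Helbing (2013) as we understand it: a link from n to m is given the length 1 - ln(P), where P is the fraction of traffic leaving n that goes to m, and the effective distance between two nodes is the length of the shortest path. The function name `effective_distances`, the dictionary layout of `flux`, and the three-node traffic numbers are hypothetical and serve only as an example.

```python
import heapq
import math

def effective_distances(flux, source):
    """Shortest-path effective distance from `source` to every reachable node.

    flux[n][m] is the traffic from n to m. Each link n -> m gets the length
    1 - ln(P), with P = flux[n][m] / total traffic leaving n.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, n = heapq.heappop(heap)
        if n in done:
            continue
        done.add(n)
        out = flux.get(n, {})
        total = sum(out.values())
        if total <= 0:
            continue
        for m, f in out.items():
            if f <= 0:
                continue
            candidate = d + 1.0 - math.log(f / total)
            if candidate < dist.get(m, math.inf):
                dist[m] = candidate
                heapq.heappush(heap, (candidate, m))
    return dist

# Hypothetical passenger counts between three locations.
flux = {
    "A": {"B": 90, "C": 10},
    "B": {"A": 50, "C": 50},
    "C": {"A": 10, "B": 90},
}
d = effective_distances(flux, "A")
print(d)
```

In this toy example the route from A to C through B is shorter in effective distance than the weak direct link, which is the sense in which strongly coupled locations are "close" regardless of geography.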

Data

Web of Science data
GitHub data
Stack Exchange data

Research Phases

2016-2017

Knowledge Hyper Graph: Explain the growth dynamics of citation networks and open-source software dependency networks using a random geometric graph on a hyperbolic manifold.
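To make the notion of a random geometric graph on a hyperbolic manifold concrete, here is a minimal sketch of one common construction on the hyperbolic plane (curvature -1). Points are scattered in a disk with a radial density that grows as sinh(alpha * r), and two points are linked when their hyperbolic distance is below the disk radius. The parameters `n`, `radius`, and `alpha`, as well as the hard-threshold connection rule, are assumptions chosen for illustration and are not the specification of the model proposed here.

```python
import math
import random

def hyperbolic_distance(r1, theta1, r2, theta2):
    """Distance between two points in polar coordinates on the hyperbolic plane."""
    x = (math.cosh(r1) * math.cosh(r2)
         - math.sinh(r1) * math.sinh(r2) * math.cos(theta1 - theta2))
    return math.acosh(max(x, 1.0))  # clamp guards against rounding below 1

def random_hyperbolic_graph(n, radius, alpha=1.0, seed=0):
    """Sample n points in a hyperbolic disk and link pairs closer than `radius`."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # Inverse-CDF sampling of the radial coordinate, density ~ sinh(alpha * r).
        u = rng.random()
        r = math.acosh(1.0 + u * (math.cosh(alpha * radius) - 1.0)) / alpha
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((r, theta))
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if hyperbolic_distance(*points[i], *points[j]) < radius
    ]
    return points, edges

points, edges = random_hyperbolic_graph(n=200, radius=8.0, alpha=0.75, seed=42)
degree = [0] * len(points)
for i, j in edges:
    degree[i] += 1
    degree[j] += 1
print(len(edges), max(degree))
```

Nodes near the center of the disk are within range of many others and tend to collect far more links than nodes near the boundary, which is how degree heterogeneity can arise from geometry alone in this kind of model.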

2017-2018

Knowledge Hyper Map: Recover the hidden hyperbolic manifold underlying citation networks and open-source software dependency networks, and use the recovered structure to identify knowledge domains, to predict citations, and to detect influential knowledge pieces and contributors.

2018-2019

Knowledge Relativity: Investigate the principle that leads to the hyperbolic manifold of knowledge. Establish the first principle of knowledge creation, which is comparable to the principle of Lorentz invariance in Einstein's theory of relativity. This principle will shed light on the boundary of human collective wisdom.

Impact
