#### Formational bounds of link prediction in collaboration networks

##### Jinseok Kim, Jana Diesner

Link prediction in collaboration networks is often solved by identifying structural properties of existing nodes that are disconnected at one point in time and that share a link later on. The maximum possible recall, or upper bound on this approach's success, is capped by the proportion of links that are formed among existing nodes that exhibit these properties. Consequently, sustained ties as well as links that involve one or two new network participants are typically not predicted. The purpose of this study is to highlight formational constraints that need to be considered to increase the practical value of link prediction methods for collaboration networks. We identify the distribution of basic link formation types in four large-scale, longitudinal collaboration networks, and show that current link predictors can anticipate at most around 25% of the links that involve at least one prior network member. This implies that, for collaboration networks, increasing the accuracy of computational link prediction solutions may not be a reasonable goal when the ratio of collaboration ties that are eligible for the classic link prediction process is low.
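The bound described above can be illustrated with a minimal sketch. It assumes two snapshots of an undirected collaboration network given as edge lists without self-loops. The function names `classify_links` and `recall_upper_bound` and the four category labels are hypothetical choices made for this illustration, not the authors' implementation or terminology.

```python
from collections import Counter

# Hypothetical labels, indexed by how many endpoints already existed earlier.
LABELS = ("new-new", "existing-new", "existing-existing")

def classify_links(prior_edges, later_edges):
    """Count later-period links by formation type (illustrative categories)."""
    prior = {frozenset(e) for e in prior_edges}
    prior_nodes = {n for e in prior for n in e}
    counts = Counter()
    for edge in {frozenset(e) for e in later_edges}:
        if edge in prior:
            # The tie already existed in the earlier period.
            counts["sustained"] += 1
        else:
            known = sum(n in prior_nodes for n in edge)
            counts[LABELS[known]] += 1
    return counts

def recall_upper_bound(counts):
    """Share of links with at least one prior member that connect two
    existing, previously unlinked nodes (the only type treated here as
    eligible for classic link prediction)."""
    involving_prior = sum(counts.values()) - counts["new-new"]
    if involving_prior == 0:
        return 0.0
    return counts["existing-existing"] / involving_prior

# Toy data, invented for illustration only.
prior = [("a", "b"), ("b", "c")]
later = [("a", "b"), ("a", "c"), ("c", "d"), ("e", "f")]
counts = classify_links(prior, later)
print(dict(counts), recall_upper_bound(counts))
```

In this toy case only the link between `a` and `c` joins two existing, previously unconnected nodes, so one of the three links involving a prior member is eligible. A predictor that perfectly ranked all disconnected existing pairs could still recover no more than that share.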
