Nodal Heterogeneity can Induce Ghost Triadic Effects in Relational Event Models

Published online by Cambridge University Press:  01 January 2025

Rūta Juozaitienė*
Affiliation:
Vytautas Magnus University
Ernst C. Wit
Affiliation:
Università della Svizzera italiana
*
Correspondence should be made to Rūta Juozaitienė, Vytautas Magnus University, Kaunas, Lithuania. ruta.juozaitiene@vdu.lt

Abstract

Temporal network data is often encoded as time-stamped interaction events between senders and receivers, such as co-authoring scientific articles or communication via email. A number of relational event frameworks have been proposed to address specific issues raised by complex temporal dependencies. These models attempt to quantify how individual behaviour, endogenous and exogenous factors, as well as interactions with other individuals modify the network dynamics over time. It is often of interest to determine whether changes in the network can be attributed to endogenous mechanisms reflecting natural relational tendencies, such as reciprocity or triadic effects. The propensity to form or receive ties can also, at least partially, be related to actor attributes. Nodal heterogeneity in the network is often modelled by including actor-specific or dyadic covariates. However, comprehensively capturing all personality traits is difficult in practice, if not impossible. A failure to account for heterogeneity may confound the substantive effect of key variables of interest. This work shows that failing to account for node-level sender and receiver effects can induce ghost triadic effects. We propose a random-effect extension of the relational event model to deal with these problems. We show that it is often more effective than more traditional approaches, such as in-degree and out-degree statistics. These results suggest that the violation of the hierarchy principle due to insufficient information about nodal heterogeneity can be resolved by including random effects in the relational event model as a standard.

Type
Theory & Methods
Copyright
Copyright © 2024 The Author(s), under exclusive licence to The Psychometric Society.

Social relationships are shaped by individuals’ interpersonal actions, which play a crucial role in both forming and maintaining these connections (Borgatti & Halgin, 2011). According to Hinde (1979), relationships can be defined as series of interactions over time. Research on social interactions highlights that exchange participants influence each other’s behaviour (Raush, 1965). For instance, when we greet someone, we typically expect and receive a greeting in return. Similarly, in the context of email communication, when we send an inquiry or request, we anticipate and hope for a response, demonstrating the expectation of reciprocity in social exchanges. The formation of social ties is also strongly influenced by homophily, which refers to the tendency of individuals to prefer interacting with others of similar type (McPherson et al., 2001).

Another commonly observed feature of social interactions is the tendency to form more complex closed structures. In its simplest manifestation these are triads. This mechanism assumes that new connections frequently emerge between people sharing common acquaintances. The results of several independent studies suggest that triadic closure can be identified as one of the fundamental dynamical principles in network formation and evolution (Li et al., 2013; Klimek & Thurner, 2013; Leskovec et al., 2008). Moreover, this tendency is widely supported on empirical grounds, since it can explain salient features of empirical social networks, including a strong community structure, fat-tailed degree distributions and high clustering coefficients (Foster et al., 2011; Newman & Park, 2003; Kumpula et al., 2007; Bianconi et al., 2014).

Researchers in both personality and social psychology acknowledge that personality differences also influence social relationships (Back, 2015; Geukes et al., 2019). Individuals’ unique personality traits and characteristics can shape how they interact with others. Due to the heterogeneity in their expansiveness, some actors tend to make many connections, while others prefer to stay on their own. Expansiveness represents a person’s degree of sociability and how much they enjoy being in crowds. Expansive people usually have a lower threshold for friendship, and as a consequence, they consider more people as their friends (Olk & Gibbons, 2010). A closely related concept, representing the tendency to receive interactions, is popularity. This feature reflects other people’s attitudes towards a particular person. Popular individuals tend more often to be the receivers of relational events. Although both expansiveness and popularity might be functions of other underlying traits, such as genetics or status, respectively, in empirical studies those traits may not be recorded. Thus, social interactions are influenced by a complex interplay of individual characteristics, environmental context, and the history of past interactions.

Relational event modelling (Butts et al., 2023) provides a flexible approach to studying the dynamic nature of social relationships. This framework attempts to quantify how individual behaviour, external factors and interaction with other individuals change the social network structure over time. It is often of interest to determine whether changes in the network can be attributed to endogenous mechanisms reflecting natural relational tendencies, such as reciprocity or triadic effects. Nodal heterogeneity in the network is often modelled by including actor-specific or dyadic covariates, such as age, gender or age difference. However, capturing the full extent of all personality traits that encompass popularity or expansiveness is often difficult, if not impossible.

This problem has also been encountered in other areas of network modelling. It has led to the development of random-effects models accounting for the latent and nodal heterogeneity (Thiemichen et al., 2016; Box-Steffensmeier et al., 2019, 2018; Kevork & Kauermann, 2021). In most cases, the individual levels of the random nodal effects are not of interest. However, accounting for additional heterogeneity is important to avoid bias in the estimation of other effects. Thus the inclusion of random effects is an elegant and straightforward way to handle the problem of an increasing number of parameters with an increasing number of actors. More importantly, this approach allows us to account for heterogeneity that may have significant implications for statistical network modelling and inference. Alternatively, various endogenous statistics such as nodal in-degree and out-degree can be used to reduce nodal heterogeneity. The interpretation of these effects, however, is rather different. Whereas random effects suggest the existence of unmeasured traits that are responsible for the network dynamics, nodal degree statistics effects imply the existence of emerging viral dynamics in the network.

Exponential random graph models (ERGMs) are a class of statistical models often used for modelling social networks. These models aim to identify features that explain the global structure of a network. A well-known issue in ERGMs is that a failure to account for heterogeneity may confound the substantive effect of key variables of interest. It has been shown (Thiemichen et al., 2016) that triadic closure estimates obtained using a model that ignores heterogeneity can vastly overstate the triadic effect present in the network. For example, if specific individuals are more outgoing than others, ERGMs not accounting for heterogeneity may confuse this feature with a network tendency towards triadic closure (Box-Steffensmeier et al., 2019). These results suggest that the existence of heterogeneity may affect the conclusions drawn from ERGMs fitted on real-world networks.

The problem that omitting individual-level predictors of tie formation can bias estimates of endogenous network parameters, like reciprocity or transitivity, is well-known. Among network modellers, it is generally known as the “hierarchy principle,” that is, include underlying substructures in the model when modelling more complicated dynamics. This is especially true for ERGMs (Lusher et al., 2013) and SAOMs (Snijders, 2017), but various applications of relational event models also discuss the importance of a proper representation of degree dynamics in order to obtain credible parameter estimates of transitive closure (Snijders et al., 2010; Corbo et al., 2016). Usually, it is recommended to model, at least, in-degree and out-degree centralisation using in-stars and out-stars (or geometrically weighted versions of them) to represent degree dynamics that are not well captured by exogenous variables. The advantage of modelling nodal heterogeneity using endogenous star-parameters is that it allows conclusions to be drawn about emergent degree dynamics: it predicts that the network dynamics alone are responsible for emerging patterns. Exogenous variables, by their very definition, do not describe emergence, but relate the changes of the network dynamic to some external process. Nevertheless, sociological processes are well-known for their overdispersion, that is, presenting more individual-level variability than is possible to model parsimoniously.

In relational event modelling, various other approaches have been proposed to account for node heterogeneity. Butts (2008) proposed including for each individual a fixed effect defined as a standard indicator function. The corresponding parameters then represent logged rate multipliers for all events having the corresponding individuals as senders or receivers. This can increase the number of parameters dramatically, and it does not distinguish between expansiveness and popularity effects. Other approaches include stochastic blockmodeling, which assumes latent groups of individuals having similar interaction tendencies (DuBois et al., 2013), or dynamic latent space relational event modeling, which allows individuals’ interactions to depend on dynamic locations in a latent space (Artico & Wit, 2023).

Although it is standard to include individual-level random effects in most sociological statistical models, this has only recently been introduced for relational event models (Uzaheta et al., 2023). As it is clear that the hierarchy principle is an important modelling concept, we propose adding individual node-level frailty terms as a general modelling strategy. Relational event models focus on behavioural interactions, which are defined as discrete events connecting a sender and a recipient at a specific point in time. In this manuscript we aim to show how unmodelled node-level heterogeneity, in terms of sender and receiver effects, may induce ghost triadic effects. We propose a frailty model for reciprocal and triadic effects in relational event networks to disentangle them from node-specific effects such as popularity and expansiveness.

1. Relational Event Models with Frailty

The basic idea of relational event models involves modelling the evolution of social interactions as the outcome of a stochastic point process. A general framework capable of exploiting the information contained in sequences of relational events has been introduced in Butts (2008). The model assumes time-stamped network data consisting of sequences $E = \{e_1, e_2, \ldots, e_T\}$ of relational events $e=(i,j,t)$, encoding the event time $t$, the event sender $i$ and the event receiver $j$. The events in this series are typically dependent on each other, as relational events often trigger others, such as replying to a message or turn-taking in conversations. This interdependence is indeed one of the main interests of network analysis, since it can identify endogenous and exogenous drivers of how people interact (Stadtfeld & Block, 2017).

The relational event model assumes that every interaction process can be encoded by a multivariate counting measure. Following Perry and Wolfe (2013), a counting process for the directed edge between sender $i$ and receiver $j$ is defined as:

$$N_{ij}(t) = \# \{\text{relational events } i \rightarrow j \text{ up to time } t\}.$$
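
As a small illustration of this definition, the following Python sketch counts directed events in an event list. The tuple layout, the function name `counting_process` and the convention that "up to time $t$" includes events at exactly time $t$ are assumptions made for the example, not part of the model specification.

```python
from typing import List, Tuple

# An event is (sender, receiver, time); this layout is an assumption of the sketch.
Event = Tuple[str, str, float]


def counting_process(events: List[Event], i: str, j: str, t: float) -> int:
    """Return N_ij(t), the number of events i -> j with time stamp <= t."""
    return sum(1 for sender, receiver, te in events
               if sender == i and receiver == j and te <= t)


# Toy event sequence (made-up data).
events = [("a", "b", 1.0), ("b", "a", 2.0), ("a", "b", 3.5), ("a", "c", 4.0)]
n_ab = counting_process(events, "a", "b", 3.5)  # events a -> b at times 1.0 and 3.5
n_ba = counting_process(events, "b", "a", 1.5)  # b -> a only occurs at time 2.0
```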

The aim of a relational event model is to capture the heterogeneity in the interaction network as well as complex relational and temporal dependencies among events. A model may include all types of predictors commonly used in social network analysis, which can be divided further into three subsets: (i) temporal network effects (such as reciprocity, transitivity, balance), which we will indicate by the variable $s_{ij}(t)$, (ii) fixed network effects, comprising attributes of the actors, such as gender or age, as well as dyadic covariates, such as age difference or socio-economic similarity, which we collectively indicate by $x_{ij}$, and (iii) random network effects, such as (receiver) popularity and (sender) expansiveness, indicated by $z_{ij}$. We define a Cox-type random-effect proportional hazards model for the relational hazard rate,

$$\lambda_{ij}(t \mid x_{ij}(t), z_{ij}, s_{ij}(t)=s) = \lambda_{s0}(t)\, \mathrm{e}^{\boldsymbol{\theta}^T x_{ij}(t) + \boldsymbol{b}^T z_{ij}}, \tag{1}$$
$$\boldsymbol{b} \sim \mathcal{N}(0, \Sigma(\phi)), \tag{2}$$

where $\lambda_{s0}(t)$ is a baseline hazard for stratum $s$, $\boldsymbol{\theta}$ are the fixed effects and $\boldsymbol{b}$ is a vector of random effects or frailties. The frailty terms are assumed to follow a normal distribution with mean zero and a, possibly parametrized, variance matrix $\Sigma(\phi)$.
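
To make the structure of the hazard concrete, the sketch below evaluates it for a single dyad. It assumes that $z_{ij}$ is an indicator vector selecting one sender frailty for $i$ and one receiver frailty for $j$, so that $\boldsymbol{b}^T z_{ij}$ is the sum of those two terms; the function name, the dictionary layout and all numerical values are illustrative assumptions.

```python
import math
from typing import Dict, Sequence


def relational_hazard(baseline: float,
                      theta: Sequence[float],
                      x_ij: Sequence[float],
                      b_sender: Dict[str, float],
                      b_receiver: Dict[str, float],
                      i: str,
                      j: str) -> float:
    """Evaluate lambda_ij = lambda_s0 * exp(theta^T x_ij + b^T z_ij).

    Assumption: z_ij is an indicator vector, so b^T z_ij reduces to the
    sender frailty of i plus the receiver frailty of j.
    """
    fixed_part = sum(th * x for th, x in zip(theta, x_ij))
    random_part = b_sender[i] + b_receiver[j]
    return baseline * math.exp(fixed_part + random_part)


# Made-up values: one covariate, two actors, baseline hazard fixed at 1.
b_sender = {"a": math.log(2.0), "b": 0.0}   # actor "a" sends at twice the rate
b_receiver = {"a": 0.0, "b": 0.0}
rate_ab = relational_hazard(1.0, [0.5], [2.0], b_sender, b_receiver, "a", "b")
rate_ba = relational_hazard(1.0, [0.5], [2.0], b_sender, b_receiver, "b", "a")
```

In this toy setting the two dyads share the same covariate value, so the ratio of the two rates is driven entirely by the sender frailty of actor `a`.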

1.1. Network Effects

Within the random-effect formulation of the relational event model, we identified three different effect types that drive the interaction dynamics between the actors. In this section we focus on each of these three effect types in more detail.

Endogenous effects. When analysing social interactions, one might reasonably expect to see some adherence to social norms, such as reciprocity, triadic closure, or other interaction mechanisms such as repetition and assortativity. These mechanisms may increase or decrease the propensity of occurrence of a given action. Reciprocity is a basic characteristic of social life, assuming that individuals tend to establish symmetric patterns of relational events. This dyadic effect describes the flow of exchange between two parties that does not occur simultaneously.

Triadic closure suggests that the presence of a common third party affects the relation between two individuals. However, there is more than one way to define a triadic effect in a directed network, see Table 1. The cyclic closure describes the relations of generalised exchange, where each individual gives and eventually receives benefits from a different person. Behavioural studies indicate that an individual’s cooperative behaviour can be based on prior experiences, regardless of the identity of the other party (Rutte & Taborsky, 2007; Fischbacher et al., 2001; Isen, 1987). This mechanism, known as generalized reciprocity (Pfeiffer et al., 2005) or indirect reciprocity (Yarmoshuk et al., 2020), assumes that previous receipt of help increases the propensity to help a stranger. Transitive closure describes the process of path shortening, whereby indirect connections between individuals tend to become direct ties over time. Triadic closure may also occur through tie formation arising from similarity in local network position. Sending balance (Vu et al., 2017), also referred to as activity-based structural homophily (Robins et al., 2009), assumes that two parties may create a tie based on their shared network activity. This effect is analogous to homophily, where the similarity in attributes leads to tie formation. An analogous process of structural homophily is called receiving balance or popularity closure. This effect is based on shared popularity, meaning that individuals may form a connection because they are chosen by the same third party.

Table 1 Common structural network effects for a directed network.
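
The four triadic mechanisms described above can be sketched as simple counts of third parties $k$ for a candidate event $i \rightarrow j$. The configurations used here (transitive: $i \rightarrow k \rightarrow j$; cyclic: $j \rightarrow k \rightarrow i$; sending balance: $i \rightarrow k$ and $j \rightarrow k$; receiving balance: $k \rightarrow i$ and $k \rightarrow j$) are assumptions read off the verbal descriptions, and the unweighted, binary treatment of past ties is a simplification chosen for the example.

```python
from typing import Dict, Iterable, Set, Tuple

Tie = Tuple[str, str]  # (sender, receiver) of at least one past event


def triadic_counts(ties: Iterable[Tie], i: str, j: str) -> Dict[str, int]:
    """Count third parties k supporting a candidate event i -> j."""
    tie_set: Set[Tie] = set(ties)
    actors = {actor for tie in tie_set for actor in tie}
    others = actors - {i, j}
    return {
        # i -> k and k -> j: an indirect path that the event would shorten
        "transitive": sum((i, k) in tie_set and (k, j) in tie_set for k in others),
        # j -> k and k -> i: the event i -> j would close a cycle
        "cyclic": sum((j, k) in tie_set and (k, i) in tie_set for k in others),
        # i -> k and j -> k: both actors send to the same third party
        "sending_balance": sum((i, k) in tie_set and (j, k) in tie_set for k in others),
        # k -> i and k -> j: both actors are chosen by the same third party
        "receiving_balance": sum((k, i) in tie_set and (k, j) in tie_set for k in others),
    }


# Made-up past ties among five actors.
past_ties = [("a", "c"), ("c", "b"), ("b", "d"), ("d", "a"), ("a", "e"), ("b", "e")]
counts = triadic_counts(past_ties, "a", "b")
```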

A special type of endogenous network effects are those associated with measures of nodal centrality reflected in degree-based statistics. A node’s emergent expansiveness can be quantified by the number of ties that originate from it, while the number of relational events that are directed towards a node is an emergent proxy of its popularity. The sender out-degree statistic measures how expansive the current sender has been in the past, i.e., how often they initiated relational events in the past. It is often defined in an exponentially weighted form (Lerner et al., 2013),

$$\text{sender out-degree}(i,t) = \sum_{(i,j,t_e),\, t_e < t} \mathrm{e}^{-(t-t_e)\frac{\ln 2}{T}}\frac{\ln 2}{T},$$

where the sum is over all past events that included $i$ as sender. The quantity $T$ is a half-life parameter determining the rate at which the weights of past events are reduced. Relational events are weighted in order to give more importance to more recent events. Note that, up to the normalising factor $\frac{\ln 2}{T}$, the limit $T \rightarrow \infty$ corresponds to the unweighted sender out-degree. The sender in-degree measures how often the current sender was targeted by others in the past, i.e., this measure defines how popular the sender was in the past,

$$\text{sender in-degree}(i,t) = \sum_{(j,i,t_e),\, t_e < t} \mathrm{e}^{-(t-t_e)\frac{\ln 2}{T}}\frac{\ln 2}{T}.$$

The receiver out-degree measures how often the current receiver initiated relational events in the past, i.e., it measures how expansive the receiver has been up till now,

$$\text{receiver out-degree}(j,t) = \sum_{(j,i,t_e),\, t_e < t} \mathrm{e}^{-(t-t_e)\frac{\ln 2}{T}}\frac{\ln 2}{T}.$$

The receiver in-degree measures how often the receiver was targeted by others in the past. It represents the popularity of the receiver,

$$\text{receiver in-degree}(j,t) = \sum_{(i,j,t_e),\, t_e < t} \mathrm{e}^{-(t-t_e)\frac{\ln 2}{T}}\frac{\ln 2}{T}.$$

Another basic mechanism in relational event models is repetition, also known as inertia. Repetition refers to the tendency of past events to be repeated in the future. In particular, this effect represents the accumulated volume of events from actor $i$ to actor $j$ by time $t$:

$$\text{repetition}(i,j,t) = \sum_{(i,j,t_e),\, t_e < t} \mathrm{e}^{-(t-t_e)\frac{\ln 2}{T}}\frac{\ln 2}{T}.$$
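
The degree and repetition statistics above share the same exponentially weighted sum and differ only in which past events are included. The following Python sketch makes this explicit; the helper `decayed_sum`, the event tuple layout and the toy data are assumptions introduced for illustration.

```python
import math
from typing import Callable, List, Tuple

Event = Tuple[str, str, float]  # (sender, receiver, time), assumed layout


def decayed_sum(events: List[Event], t: float, half_life: float,
                keep: Callable[[str, str], bool]) -> float:
    """Sum exp(-(t - t_e) ln2/T) * ln2/T over past events (t_e < t) selected by `keep`."""
    rate = math.log(2.0) / half_life
    return sum(math.exp(-(t - te) * rate) * rate
               for s, r, te in events if te < t and keep(s, r))


def out_degree(events: List[Event], actor: str, t: float, half_life: float) -> float:
    """Weighted number of past events sent by `actor` (sender or receiver out-degree)."""
    return decayed_sum(events, t, half_life, lambda s, r: s == actor)


def in_degree(events: List[Event], actor: str, t: float, half_life: float) -> float:
    """Weighted number of past events received by `actor` (sender or receiver in-degree)."""
    return decayed_sum(events, t, half_life, lambda s, r: r == actor)


def repetition(events: List[Event], i: str, j: str, t: float, half_life: float) -> float:
    """Weighted number of past events from i to j."""
    return decayed_sum(events, t, half_life, lambda s, r: s == i and r == j)


# Toy data: with half-life 1, an event one time unit old has half the weight.
events = [("a", "b", 0.0), ("a", "c", 1.0), ("b", "a", 2.0)]
out_a = out_degree(events, "a", t=2.0, half_life=1.0)    # (0.25 + 0.5) * ln 2
in_a = in_degree(events, "a", t=2.0, half_life=1.0)      # event at t = 2.0 is not yet "past"
rep_ab = repetition(events, "a", "b", t=2.0, half_life=1.0)
```

Whether a statistic is a "sender" or a "receiver" statistic depends only on which actor of the candidate event is passed as `actor`.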

Other endogenous network effects may be used to capture various types of participation shifts that play a role in conversational norms (Butts, 2008; Vu et al., 2017). For example, a turn-taking effect describes the situation when the receiver takes over the initiative from the current sender. In this scenario, actor $i$ initiates an event towards actor $j$, and subsequently, $j$ initiates an event with an individual other than $i$. To measure this effect, we can use a statistic that depends on the elapsed time since the last event that satisfies the aforementioned conditions:

$$\text{turn-taking}(j,k,t) = \max_{(i,j,t_e),\, t_e < t,\, i\ne k} \mathrm{e}^{-(t-t_e)\frac{\ln 2}{T}}\frac{\ln 2}{T}.$$

Another example is turn-continuing, which refers to scenarios where the sender is preserved in multiple relational events. Thus, it involves an event initiated by actor $i$ towards actor $j$, followed by $i$ initiating an event with another individual:

$$\text{turn-continuing}(i,k,t) = \max_{(i,j,t_e),\, t_e < t,\, j\ne k} \mathrm{e}^{-(t-t_e)\frac{\ln 2}{T}}\frac{\ln 2}{T}.$$
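
Because the two participation-shift statistics take a maximum rather than a sum, only the most recent qualifying event matters. A minimal Python sketch follows; returning 0.0 when no qualifying event exists is an assumption of the example, as are the function names and the toy data.

```python
import math
from typing import List, Tuple

Event = Tuple[str, str, float]  # (sender, receiver, time), assumed layout


def _weight(t: float, te: float, half_life: float) -> float:
    """Exponential weight exp(-(t - t_e) ln2/T) * ln2/T of a past event."""
    rate = math.log(2.0) / half_life
    return math.exp(-(t - te) * rate) * rate


def turn_taking(events: List[Event], j: str, k: str, t: float, half_life: float) -> float:
    """Statistic for candidate event j -> k: j was recently addressed by some i != k."""
    return max((_weight(t, te, half_life)
                for s, r, te in events if r == j and s != k and te < t),
               default=0.0)  # assumption: zero when no qualifying event exists


def turn_continuing(events: List[Event], i: str, k: str, t: float, half_life: float) -> float:
    """Statistic for candidate event i -> k: i recently addressed some j != k."""
    return max((_weight(t, te, half_life)
                for s, r, te in events if s == i and r != k and te < t),
               default=0.0)


events = [("a", "b", 0.0), ("c", "b", 1.0)]
tt_ba = turn_taking(events, "b", "a", t=2.0, half_life=1.0)      # last qualifying event: c -> b at 1.0
tt_bc = turn_taking(events, "b", "c", t=2.0, half_life=1.0)      # last qualifying event: a -> b at 0.0
tc_ac = turn_continuing(events, "a", "c", t=2.0, half_life=1.0)  # a -> b at 0.0 qualifies
tc_ab = turn_continuing(events, "a", "b", t=2.0, half_life=1.0)  # no event a -> (not b)
```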

Often, these structural endogenous network effects have been considered as independent variables capturing network patterns influencing event occurrence. A sliding window technique or a weight function has been proposed to account for the temporal aspect of these fundamental endogenous drivers of social interactions. Another approach allowing for time-varying network effects suggests using a stratified baseline hazard (Juozaitienė & Wit, 2022). Stratification avoids making strong assumptions regarding the temporal structure of network formation mechanisms, such as monotonic decay parameters. This approach estimates the strata-specific baseline hazards,

$$\hat{\lambda}_{0s}(t) = \frac{\partial}{\partial t}\widehat{\Lambda}_{0s}(t),$$

where $\widehat{\Lambda}_{0s}(t)$ is a smooth penalised spline estimate of a cumulative baseline hazard.

Exogenous effects. The proposed model (1)–(2) may also incorporate covariate effects representing sender and receiver monadic attributes, such as gender, or dyadic relations, such as living in the same neighbourhood or age difference. These covariates represent how exogenous forces shape network formation.

Random effects. In traditional relational event models, nodes are assumed to be homogeneous, except for the differences captured in available nodal covariates or in the past dynamics of the network. However, this assumption may be insufficient, especially in the context of social networks. The traits governing an individual’s sociality and popularity may be complex. For example, some people tend to communicate more actively because of personality traits, such as charisma, that are difficult to quantify. The heterogeneity of individuals can be very important to network formation and directly related to the hazard of experiencing an event since more resistant observations remain in the risk set longer (Steele, 2003). Therefore, the inclusion of nodal random effects enriches the model by accounting for heterogeneity in the nodes of a network, which could not be captured otherwise.

The node-specific random effects represent the propensity for individuals to send (expansiveness) and receive (popularity) ties. The expansiveness effect encapsulates all aspects related to an individual’s eagerness to initiate events. Similarly, popularity summarises all of an individual’s features that determine their attractiveness as a receiver. Many other random effects can be defined. In fact, random versions of all the above endogenous and exogenous variables can be considered.

1.2. Frailty Model Estimation

Following Therneau and Grambsch (2000), an integrated partial likelihood for the model (1)–(2) is given as

$$\text{IPL}(\theta,\phi) = \frac{1}{(2\pi)^{q/2}|\Sigma(\phi)|^{1/2}} \int \text{PL}(\theta,b)\, \mathrm{e}^{-b^T\Sigma^{-1}b/2}\, \mathrm{d}b,$$

where $q$ is the number of random effects and $\text{PL}(\theta,b) = \prod_{o=1}^n \lambda_{i_oj_o}(t_o)/\sum_{(s,r)\in \mathcal{R}(t_o)} \lambda_{sr}(t_o)$ is the Cox partial likelihood for any fixed values of $\theta$ and $b$, with $\mathcal{R}(t_o)$ denoting the set of sender-receiver pairs at risk at the time $t_o$ of the $o$-th event. However, the IPL is an intractable multidimensional integral, and to perform computations involving this likelihood we use the Laplace approximation. In this case, the log penalised partial likelihood (LPPL) is replaced with a second-order Taylor series about its value at the maximum of the function,

$$\begin{aligned} \text{LPPL}(\theta,b,\phi) &= \text{LPL}(\theta,b) - \frac{1}{2}b^TA^{-1}(\phi)b \\ &\approx \text{LPPL}(\hat{\theta}(\phi),\hat{b}(\phi)) - \frac{1}{2}(b - \hat{b})^TH_{bb}(b - \hat{b}), \end{aligned}$$

where H is the matrix of second derivatives of the LPPL and $H_{bb}$ is the portion of this matrix corresponding to the random effects. When $\phi$ is fixed, the relevant values of $\theta$ and b that maximize the LPPL can be computed using the same methods as a usual Cox regression model.
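To make the Laplace step concrete, the following minimal sketch compares a Laplace approximation with brute-force numerical integration for a single random effect. It is not the relational event model itself: the Poisson-type log-likelihood, the values `y`, `exposure` and `sigma`, and all function names are hypothetical choices made for illustration. The sketch also assumes that the curvature term is the negative second derivative of the penalised log-likelihood at its maximum, so that it is positive.

```python
import numpy as np

def log_lik(b, y=40.0, exposure=25.0):
    """Toy log-likelihood in one random effect b (hypothetical Poisson-type form)."""
    return y * b - exposure * np.exp(b)

def laplace_log_integral(sigma, y=40.0, exposure=25.0, n_iter=50):
    """Laplace approximation to log[(2*pi*sigma^2)^(-1/2) * int exp(l(b) - b^2/(2*sigma^2)) db]."""
    b = 0.0
    for _ in range(n_iter):  # Newton steps on the penalised log-likelihood
        grad = y - exposure * np.exp(b) - b / sigma**2
        hess = -exposure * np.exp(b) - 1.0 / sigma**2
        b -= grad / hess
    # Assumed convention: curvature is the NEGATIVE second derivative at the mode
    h_bb = exposure * np.exp(b) + 1.0 / sigma**2
    penalised = log_lik(b, y, exposure) - 0.5 * b**2 / sigma**2
    return penalised - 0.5 * np.log(sigma**2) - 0.5 * np.log(h_bb)

def numeric_log_integral(sigma, y=40.0, exposure=25.0):
    """Same quantity by a fine Riemann sum, stabilised by subtracting the maximum."""
    grid = np.linspace(-6.0, 6.0, 200001)
    log_f = log_lik(grid, y, exposure) - 0.5 * grid**2 / sigma**2
    m = log_f.max()
    dx = grid[1] - grid[0]
    integral = np.sum(np.exp(log_f - m)) * dx
    return m + np.log(integral) - 0.5 * np.log(2.0 * np.pi * sigma**2)

approx = laplace_log_integral(sigma=0.9)
exact = numeric_log_integral(sigma=0.9)
print(abs(approx - exact))  # small when the integrand is close to Gaussian
```

In one dimension the integral is cheap to compute directly; the approximation matters when q is large and the integral cannot be evaluated on a grid.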

1.3. Likelihood-Ratio Test

We can formally test the significance of the random effects using a likelihood-ratio test, which compares the goodness of fit of two nested statistical models. The likelihood-ratio test statistic is defined as follows:

$$\text{LR} = -2(\ln L_0 - \ln L_1),$$

where $L_0$ and $L_1$ are the maximum likelihood values for the reduced and full models, respectively. This statistic has an asymptotic $\chi^2$ distribution with degrees of freedom equal to the difference in the number of parameters between the two models. We propose to perform the likelihood-ratio test based on the approximate integrated partial likelihood.
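As a minimal sketch of the test statistic above, the following code computes LR and its asymptotic p-value using only the Python standard library. The function names and the log-likelihood values in the usage lines are hypothetical, and taking two degrees of freedom (one variance parameter each for popularity and expansiveness) is an assumption made for this example.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))  # closed form for df = 1
    else:
        k, q = 2, math.exp(-half)             # closed form for df = 2
    while k < df:
        # Recurrence Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1), with a = k / 2
        a = k / 2.0
        q += half**a * math.exp(-half) / math.gamma(a + 1.0)
        k += 2
    return q

def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return the statistic LR = -2 (ln L0 - ln L1) and its asymptotic p-value."""
    lr = -2.0 * (loglik_reduced - loglik_full)
    return lr, chi2_sf(lr, df)

# Hypothetical log-likelihood values, for illustration only
lr, p_value = likelihood_ratio_test(loglik_reduced=-1050.0, loglik_full=-1040.0, df=2)
print(lr, p_value)
```

In the setting of this paper, the two log-likelihoods would be the approximate integrated partial likelihoods of the models without and with frailty terms.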

2. Simulation Studies

In order to test the performance of the proposed framework and assess the consequences of neglecting nodal heterogeneity, we simulate and examine networks in four sets of analyses. The first two simulation studies analyse whether the proposed frailty approach recovers the true parameters, and subsequently how its estimates of the model parameters improve under increasing sample size scenarios. In the third simulation study we show how nodal heterogeneity induces ghost triadic effects, even when traditional nodal degree statistics are included in the model. The objective of the fourth simulation study is to illustrate the usefulness of the partial likelihood-ratio test for the inclusion of the random effects.

2.1. Network Effects Recovery

A simulation-based experiment is conducted to demonstrate that the proposed approach is able to adequately recover the underlying parameters. To analyse the relative bias in parameters estimated with correctly specified frailty models, we consider the following simulation scenario. In each of 20 replications, we simulate 10,000 events among 100 individuals. The rate of each event is defined following the assumption that triadic closure effects have no impact on link occurrence. The baseline hazard function is set to be a constant equal to 1, assuming that waiting times are exponentially distributed. Therefore, the rate of each event depends only on the nodal random effects. The popularity and expansiveness random effects are generated from a normal distribution, i.e., $b_i^{\text{pop}} \sim \mathcal{N}(0,1.3^2)$, $b_i^{\text{exp}} \sim \mathcal{N}(0,0.9^2)$. Each dataset is analysed by the relational event model with frailty given in (1 2). Four stratified models are fitted to the generated datasets focusing on one triadic closure effect at a time.
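The data-generating step of this scenario can be sketched as follows. This is a minimal illustration, not the authors' simulation code: it assumes a log-linear rate, exp(b_s^exp + b_r^pop), for the dyad from sender s to receiver r with baseline hazard 1, and the function name and defaults are hypothetical. Because the rates are constant over time, the waiting time to the next event is exponential with the total rate, and the dyad is drawn with probability proportional to its rate.

```python
import numpy as np

def simulate_events(n_nodes=100, n_events=10_000, sd_exp=0.9, sd_pop=1.3, seed=0):
    """Simulate relational events driven only by nodal random effects."""
    rng = np.random.default_rng(seed)
    b_exp = sd_exp * rng.standard_normal(n_nodes)  # expansiveness (sender) effects
    b_pop = sd_pop * rng.standard_normal(n_nodes)  # popularity (receiver) effects
    # Assumed log-linear rate with baseline 1: rate[s, r] = exp(b_exp[s] + b_pop[r])
    rate = np.exp(b_exp[:, None] + b_pop[None, :])
    np.fill_diagonal(rate, 0.0)                    # no self-directed events
    total = rate.sum()
    # Superposition of independent exponential clocks: exponential waits with the total rate
    times = np.cumsum(rng.exponential(1.0 / total, n_events))
    # Each event is assigned to a dyad with probability proportional to its rate
    dyads = rng.choice(n_nodes * n_nodes, size=n_events, p=rate.ravel() / total)
    senders, receivers = np.divmod(dyads, n_nodes)
    return times, senders, receivers, b_exp, b_pop

times, senders, receivers, b_exp, b_pop = simulate_events()
print(times[:3], senders[:3], receivers[:3])
```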

Figure. 1 Boxplot of the estimated standard deviations of (a) the expansiveness random effect (0.9) and (b) the popularity random effect (1.3) in four different model formulations.

The standard deviations of the random effects are recovered to a fairly high level of accuracy (see Fig. 1). The proposed frailty model is able to estimate the random effect standard deviation values with no noticeable bias patterns, regardless of the triadic closure model. That is, both standard deviation estimates are centred on their true values (0.9 and 1.3, respectively). Moreover, the estimated smooth baseline hazard curves in Fig. 2 indicate that the frailty approach recovers the underlying process well. The proposed framework adequately recovers the shape of the underlying distribution with no obvious bias in its magnitude.

Figure. 2 True (dashed) and estimated (solid) baseline hazard curves: different colours represent the results of four random effect models focusing on different triadic closure effects. There is no obvious bias for any of the models.

2.2. Effect of Increasing Sample Size

To assess the effect of varying sample size on estimator performance, we vary two factors: number of individuals (i.e., the number of random effects levels) and the number of events (number of observations). For each scenario, we generate data under previously presented settings. Nodal heterogeneity is introduced in the form of the random effects that are drawn from a normal distribution with $\sigma_{\text{pop}} = 0.9$, $\sigma_{\text{exp}} = 0.5$. To explore the importance of varying number of events we simulate three datasets under different scenarios, i.e., we simulate 500, 1500 and 4500 events among 100 individuals. Accordingly, to analyse the effect of varying number of individuals we simulate 10,000 events among 10, 30 and 90 individuals. Each dataset is analysed using the previously described frailty model.

Figure 3 summarises the results of 100 replications. Figure 3a demonstrates the effects of increasing the number of individuals on estimator performance by providing the estimated standard deviations for three different scenarios. The results for the popularity standard deviation are similar. As expected, the uncertainty around standard deviation estimates generally decreases with an increasing number of random effects levels in an approximate $1/\sqrt{n}$ fashion (here: proportional to $1/\sqrt{10}$, $1/\sqrt{30}$ and $1/\sqrt{90}$, respectively). Results presented in Fig. 3b indicate that increasing the number of relational events has only a minor effect on random effect estimation accuracy. The reason is that the main limiting factor is the number of simulated random effects, which remains constant (here: 100).
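The $1/\sqrt{n}$ behaviour can be illustrated with an idealised sketch in which the random effects are observed directly rather than inferred from events. This is a simplifying assumption, so the numbers are not comparable with Fig. 3; the function name, seed and replication count are hypothetical.

```python
import numpy as np

def sd_estimate_spread(n_levels, true_sd=0.5, n_rep=2000, seed=1):
    """Spread (standard deviation) of the sample-SD estimator across replications."""
    rng = np.random.default_rng(seed)
    draws = true_sd * rng.standard_normal((n_rep, n_levels))
    sd_hat = draws.std(axis=1, ddof=1)  # one SD estimate per replication
    return sd_hat.std(ddof=1)

spread = {n: sd_estimate_spread(n) for n in (10, 30, 90)}
for n, value in spread.items():
    print(n, value)
```

Going from 10 to 90 levels multiplies n by nine, so the spread should shrink by a factor of roughly three.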

Figure. 3 (a) Typical $1/\sqrt{n}$ effect of network size on the estimate of the expansiveness standard deviation; (b) almost no effect of the number of relational events on the estimate of the popularity standard deviation.

2.3. Expansiveness and Popularity Induce Ghost Triadic Effects

Another simulation study is conducted to demonstrate the consequences of not accounting for nodal random effects in relational event models. In particular, we are interested in how heterogeneity affects the estimates of triadic closure effects. We consider the same simulation scenario as above, including only random node effects for popularity and expansiveness. We fit two models. The first includes only endogenous effects, such as reciprocity and triadic closure, whereas the second also includes traditional nodal degree statistics to account for nodal heterogeneity. Both models ignore the random effects, as in a traditional relational event model.

Figure. 4 (a) Phantom triadic effects are identified as a result of the presence of random popularity and expansiveness effects. (b) Degree- and intensity-based statistics are unable to fully account for nodal heterogeneity. In both plots the solid curves depict the overestimated triadic closure effects, while the dashed line represents the actual level of triadic closure.

In the first analysis we fit only the four triadic effects, each based on a stratified relational event model that also includes reciprocity. Figure 4a shows solid lines of the estimated hazard functions for each triadic closure effect. The dashed line is the true baseline hazard function. It is immediately apparent that the estimates from the stratified relational event model severely overstate the triadic effects, particularly in the beginning. These results confirm that failure to account for nodal heterogeneity can vastly overestimate the actual amount of triadic closure present in the network. Furthermore, it is interesting to note the different bias patterns among the four triadic closure effects. The most severe bias is observed for the transitive closure effect. This suggests that transitive closure is the most sensitive to heterogeneity, and it responds to misspecification by significantly increasing in magnitude. Intuitively, this is due to the nature of transitive closure. This structural effect implies a local hierarchy, with one node only receiving ties, one node only sending ties and one neutral node receiving and sending ties. Therefore, the receiving node is the most popular within the group, and the sending node is the most expansive within it. For this reason, the effect of shared partners might be overestimated in order to compensate for underestimating the effects of social behaviour. On the other hand, in a cyclic closure, the direction of all ties is consistent and none of the three nodes is singled out, which agrees with the simulation results indicating that cyclic closure is more robust to nodal heterogeneity.
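The intuition can be checked with a crude descriptive sketch that is much simpler than the stratified model used here. It generates events from nodal effects alone, with no triadic mechanism, and counts how often a new event from s to r closes an existing two-path through some third node k. The log-linear rate, the function name and the settings are assumptions made for this illustration; the raw closing frequency is not the estimated hazard shown in Fig. 4.

```python
import numpy as np

def closure_fraction(sd_exp, sd_pop, n_nodes=100, n_events=500, seed=0):
    """Fraction of events s -> r that close a two-path s -> k -> r of earlier events."""
    rng = np.random.default_rng(seed)
    b_exp = sd_exp * rng.standard_normal(n_nodes)
    b_pop = sd_pop * rng.standard_normal(n_nodes)
    # Assumed log-linear rates depending only on nodal effects (no triadic effect)
    rate = np.exp(b_exp[:, None] + b_pop[None, :])
    np.fill_diagonal(rate, 0.0)
    dyads = rng.choice(n_nodes * n_nodes, size=n_events, p=rate.ravel() / rate.sum())
    senders, receivers = np.divmod(dyads, n_nodes)
    seen = np.zeros((n_nodes, n_nodes), dtype=bool)  # seen[i, j]: i has already sent to j
    closed = 0
    for s, r in zip(senders, receivers):
        if np.any(seen[s, :] & seen[:, r]):          # some k with s -> k and k -> r
            closed += 1
        seen[s, r] = True
    return closed / n_events

heterogeneous = np.mean([closure_fraction(0.9, 1.3, seed=i) for i in range(5)])
homogeneous = np.mean([closure_fraction(0.0, 0.0, seed=i) for i in range(5)])
print(heterogeneous, homogeneous)
```

With heterogeneous nodes, events concentrate on expansive senders and popular receivers, so apparent transitive closure occurs more often than in the homogeneous case even though neither process contains a triadic effect.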

Figure. 5 For each triadic closure effect, the boxplot represents the distribution of the parameter estimates obtained over 20 replications. In all cases, we observe positive receiver in-degree and sender out-degree effects.

It can be argued that our first analysis is too simplistic. The usual approaches to account for nodal heterogeneity suggest including degree- and intensity-based network effects. Therefore, for each triadic closure effect, we also estimate the model, incorporating sender and receiver in-degree and out-degree, repetition, turn-taking and turn-continuing effects. The distributions of the estimated effect sizes are summarized in Fig. 5. Notably, we observe positive estimates for sender out-degree and receiver in-degree effects, indicating that these network statistics can capture a portion of individuals’ popularity and expansiveness. However, the included degree- and intensity-based network effects do not entirely account for variations between individuals. Figure 4b shows that the estimated baseline hazard functions for each triadic effect still tend to overestimate the actual amount of triadic closure. Thus, we can conclude that degree- and intensity-based statistics can be used to reduce nodal heterogeneity to some extent. However, they are insufficient for fully accounting for nodal heterogeneity and for obtaining credible estimates of other endogenous network effects, such as transitive closure.
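For readers unfamiliar with degree-based statistics, the following minimal sketch computes them from an event list. It assumes that degrees are plain counts of earlier events in list order; other definitions (for example, counting distinct partners or using a time window) are possible, and the function and key names are hypothetical.

```python
from collections import defaultdict

def degree_statistics(events):
    """For each (time, sender, receiver) event, return degree counts based on earlier events."""
    out_deg = defaultdict(int)  # number of events sent so far
    in_deg = defaultdict(int)   # number of events received so far
    rows = []
    for _, sender, receiver in events:
        rows.append({
            "sender_out": out_deg[sender],
            "sender_in": in_deg[sender],
            "receiver_out": out_deg[receiver],
            "receiver_in": in_deg[receiver],
        })
        # Update the history only after the statistics for this event are recorded
        out_deg[sender] += 1
        in_deg[receiver] += 1
    return rows

example = [(1.0, "a", "b"), (2.0, "a", "c"), (3.0, "b", "a")]
print(degree_statistics(example)[2])
```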

2.4. Performance of Partial Likelihood-Ratio Test

The previous section demonstrates the severe consequences of a failure to account for nodal heterogeneity that is not explained by exogenous covariates. In fact, there is greater harm in excluding a random effect that is necessary than in including a random effect that is not needed (Gelman & Hill, 2006). Nevertheless, from the point of view of parsimony and interpretation, we want a tool that is able to test for the need to include random effects. In this section, we assess the performance of the partial likelihood-ratio test based on the integrated partial likelihood.

Figure. 6 QQplot comparing the uniform distribution to the p-values for partial likelihood ratio test, suggesting that the likelihood ratio test based on the approximate integrated partial likelihood is slightly conservative.

We simulate 10,000 events among 100 individuals assuming that waiting times are exponentially distributed. Spontaneous event times follow an exponential distribution with parameter $\lambda_{sp} = 1$, reciprocal events have a higher risk $\lambda_{re} = 2$, parameters for transitive and R+T strata events are, respectively, equal to $\lambda_{tr} = 3$ and $\lambda_{r+t} = 4$. Thus, we assume that network dynamics is driven by the triadic effects, and there are no random effects.

For each dataset, we fit two stratified relational event models, with and without frailty terms. The fitted models are compared using the integrated version of the partial likelihood-ratio test. Figure 6 shows the QQplot comparing the quantiles of the calculated p-values to the quantiles of the uniform distribution. We can see that the p-values are slightly conservative. This means that (i) the test tends not to include random effects when they are not needed, and (ii) it requires a bit more evidence when they are needed.
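A QQ comparison of this kind can be sketched as follows. The p-values here are synthetic and deliberately skewed towards larger values to mimic a conservative test; they are not the p-values from this simulation study, and the function names and plotting positions are hypothetical choices.

```python
import numpy as np

def uniform_qq(p_values):
    """Return (uniform quantiles, sorted p-values) for a QQ comparison."""
    p_sorted = np.sort(np.asarray(p_values, dtype=float))
    n = p_sorted.size
    expected = (np.arange(1, n + 1) - 0.5) / n  # plotting positions under uniformity
    return expected, p_sorted

def rejection_rate(p_values, alpha=0.05):
    """Empirical share of p-values below the nominal level alpha."""
    return float(np.mean(np.asarray(p_values) < alpha))

rng = np.random.default_rng(0)
# Synthetic conservative p-values: u ** 0.8 is never smaller than u on [0, 1]
synthetic_p = rng.uniform(size=5000) ** 0.8
expected, observed = uniform_qq(synthetic_p)
print(rejection_rate(synthetic_p), float(np.mean(observed - expected)))
```

For a conservative test, the sorted p-values lie above the diagonal and the empirical rejection rate under the null falls below the nominal level.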

3. Illustrative Case Studies

While the previous section demonstrated the potential importance of directly modelling nodal heterogeneity in simulated network data, there remains the question of how useful the random popularity and expansiveness model is in practical, real-world problems. For this reason, in this section we analyse six real-world datasets as illustrative examples of the importance of accounting for nodal heterogeneity.

  1. Manufacturing company (Michalski et al., 2014): a dynamic network describing the internal email communication between employees of a mid-sized manufacturing company. This study contains 82,614 email communications observed among 176 employees over a 9-month period beginning in January 2010.

  2. Enron email (Klimt & Yang, 2004): 2934 emails among 145 individuals between July 2001 and August 2001.

  3. Classroom (Mcfarland, 2001): 691 communication events recorded between 20 individuals within a high school classroom. The data also contain two actor-level covariates defining the individual’s gender and role (i.e. student or teacher).

  4. Phone calls (Sapiezynski et al., 2019): 3600 phone calls among 540 students observed over a period of 4 weeks. The dataset was collected via smartphones as part of the Copenhagen Networks Study.

  5. Social evolution (Madan et al., 2011): 439 phone calls observed among the 54 students residing in a university dormitory. The dataset also includes two exogenous variables: the floor of the dormitory on which the student resides, and the grade type of each student (i.e. freshman, sophomore, junior, senior, or graduate tutor).

  6. Virtual Battlespace 2 (VBS2) game (Pilny et al., 2016): 299 communications among 4 players who were engaged in a VBS2 game scenario.

We have slightly preprocessed the datasets by removing instances where sender and receiver coincide, as well as events that occur simultaneously but have different senders, as such events might create situations where an open triad is created and closed at the same time. We did allow for multicast events, where a communication is sent to multiple receivers. In these applications, multiple events occurring at the exact same time are treated as ties. In order to deal with tied events, we use Efron’s approximation because it is accurate and computationally efficient (Therneau & Grambsch, 2000).
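The preprocessing rules described above can be sketched as follows. This is one possible reading, not the authors' code: it assumes that self-directed events are removed first and that, when a timestamp then has more than one distinct sender, all events at that timestamp are dropped. The function name and the toy events are hypothetical.

```python
from collections import defaultdict

def preprocess(events):
    """Filter a list of (time, sender, receiver) events."""
    # Step 1: drop events whose sender and receiver coincide
    no_loops = [e for e in events if e[1] != e[2]]
    # Step 2: collect the distinct senders active at each timestamp
    senders_at = defaultdict(set)
    for time, sender, _ in no_loops:
        senders_at[time].add(sender)
    # Step 3: keep only timestamps with a single sender (multicast events survive)
    return [e for e in no_loops if len(senders_at[e[0]]) == 1]

raw = [
    (1, "a", "a"),  # self-directed: removed
    (2, "a", "b"),  # multicast from a: kept
    (2, "a", "c"),  # multicast from a: kept
    (3, "b", "c"),  # simultaneous with a different sender: removed
    (3, "c", "a"),  # simultaneous with a different sender: removed
    (4, "c", "b"),  # kept
]
print(preprocess(raw))
```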

To analyse how nodal heterogeneity affects the estimates of triadic closure effects, for each dataset we fit the stratified relational event model both with and without sender and receiver random effects. Both models include a list of endogenous network effects, i.e., in-degree and out-degree statistics for sending and receiving nodes, repetition, turn-taking, turn-continuing, reciprocity and triadic effects. For the email communication datasets (Manufacturing company and Enron email) we also included an offset $\ln(n_{r(e)})$, the natural logarithm of the number of receivers per email $n_{r(e)}$.

Based on the AIC, we identify the most appropriate triadic effect for each dataset. We find that the transitive closure effect is the most suitable for the Manufacturing company data, while the sending balance effect is the most appropriate for the Phone calls, Enron email and Social evolution datasets. For the Classroom data we use the cyclic closure effect. Only the VBS2 game dataset does not exhibit any triadic closure effect, resulting in a model that includes only the time-varying reciprocity effect. These estimates are shown in Table 2.

Table 2 Estimated reciprocity and triadic effects for fixed and mixed effects models, showing that unmodelled nodal heterogeneity severely distorts these effects in the manufacturing company, classroom and phone calls datasets.

Table 3 reports the model estimates for all datasets, including the result of the likelihood ratio test comparing the model with random effects to the model without random effects. Importantly, for 5 out of the 6 examples analysed we find that the random effects are not only statistically significant, but also have substantial standard deviations. This is on top of the various nodal degree-based statistics that are included by default. Only in the VBS2 game data is there little evidence for nodal heterogeneity, possibly due to the fact that there are only 4 nodes present. This demonstrates that the proposed approach combined with a likelihood ratio test is capable of evaluating whether the inclusion of random effects enhances model performance. It does not advocate for the blind inclusion of the frailty terms in all cases.

Table 3 Parameter estimates for the fixed and random effect standard deviations across all datasets (*p value < 0.05, p value < 0.1).

The results of the partial likelihood-ratio test indicate a significant preference for the model with nodal random effects in the majority of cases. Only for the VBS2 game dataset is the model without random effects found to be sufficient.

We also provide significance tests for all the degree-based statistics. A number of them are significant, even in the presence of the random effects. This suggests that there is some level of emergence or virality in these dynamic networks. For instance, in the Manufacturing company data, coefficients associated with the sender out-degree and receiver out-degree statistics suggest that the mere fact of having communicated or having been contacted in the past makes a person more likely to initiate or receive emails in the future, respectively. Nevertheless, both effects are quite small compared to the standard deviation of the expansiveness and popularity frailty terms. This means that intrinsic nodal heterogeneity is more important than the viral effects. We also observe a modest positive turn-taking effect, indicating that individuals tend to continue the discussion thread initiated by others. Moreover, this model suggests that repetition has a small negative effect in email communication, indicating that past events have a reduced tendency to be repeated in the future. This phenomenon can be attributed to various factors such as evolving topics, shifting communication needs, or the dynamic nature of conversations. As individuals engage in email exchanges, they naturally adapt their communication patterns, explore new topics, and respond to evolving circumstances. This dynamic nature of email communication may lead to a decreased likelihood of direct repetition of the same events over time. However, the effect sizes of the latter two effects are relatively small, suggesting that repetition and turn-taking are less prominent compared to other network effects.

We end the empirical analysis section with a more careful inspection of the various reciprocity and triadic closure effects for each of the datasets. Table 2 displays how the reciprocity and triadic effect curves can be biased when one fails to account for nodal heterogeneity. For instance, in the Manufacturing company example, failing to account for nodal heterogeneity seems to suggest a strong triadic closure tendency that rapidly decays over time. A similar situation can also be observed in the Classroom data and in the Phone calls data. However, analysing the corresponding curve from the frailty model, which is supported by the likelihood ratio test, we can conclude that after accounting for nodal heterogeneity this triadic closure effect disappears. For the Enron email data and the Social evolution data, this effect is not so pronounced. In fact, we are quite confident that a triadic closure effect is really present in the Enron email data.

4. Conclusions

Given the complexity of real-world processes, nodal heterogeneity is likely to exist in most empirical networks. Common approaches to address heterogeneity include introducing various exogenous and endogenous network statistics. This is entirely sensible. However, the available exogenous and endogenous variables might not capture all the complexities of the generative process. For this reason, we stress the importance of being able to include frailty terms in the relational event model in order to account for this residual heterogeneity. Moreover, we argue that failing to account for heterogeneity can seriously affect inference about the strength of various endogenous network effects. In particular, our work suggests that heterogeneity results in severely biased estimates of triadic closure effects. These results suggest that heterogeneity may also play a role in the estimation of more complex network effects, including assortative matrix effects, four-cycles, etc. In addition, the case study analysis revealed that heterogeneity may also affect simpler network mechanisms, such as reciprocity. In this paper, we proposed using a relational event model frailty approach as a flexible tool to model heterogeneity in the network that is not otherwise captured by the available covariates. We also showed that likelihood ratio tests can be used to test for the need to include such frailty terms.

The theoretical benefits of accounting for nodal heterogeneity have been illustrated through a simulation study. Numerical experiments confirmed that a failure to account for nodal heterogeneity can vastly overstate the prevalence of the triadic closure effect present in the network. Furthermore, the findings revealed that the transitive closure effect, due to its nature, is the most sensitive to nodal heterogeneity. Additionally, the nature of the structural effect is closely related to the strength of influence of expansiveness and popularity. Depending on the type of triadic effect, these nodal random effects have a different impact on the emergence of ghost triadic closure effects. Simulation studies also confirmed that the frailty approach is capable of producing accurate estimates of the underlying parameters. As expected, the precision of the standard deviation estimates increases with the number of levels of the random effect, i.e., higher replication of the random effects results in more precise estimates. Assessing the performance of the partial likelihood-ratio test, we noted that the test produces only slightly conservative p-values, making it a suitable way to test for the inclusion of the frailty terms.

This work revealed that nodal heterogeneity might disguise itself as a triadic closure effect when it is not accounted for, i.e., if it is not well explained by the observed nodal characteristics. The suggested frailty approach is capable of recovering the temporal reciprocity and triadic closure curves, disentangling random nodal effects from triadic closure. The computational cost of the model is that of any traditional mixed effect model, making it possible to incorporate nodal random effects as a standard in empirical studies.

Acknowledgements

This work was supported by an STSM Grant from COST Action COSTNET (CA15109). EW acknowledges funding by SNSF (Grants 188534, 192549).

Social evolution study: model output

The final random effects model selected also included two fixed effects, namely whether they lived on the same floor and whether they were in the same year. Both effects are positive, suggesting that sharing floor and year increases the rate of interaction.

Classroom study: model output

The final random effects model selected also included three fixed effects, namely whether the receiver is female, whether the sender is a teacher and whether the receiver is a teacher. The first effect is not significant, whereas teachers have higher sending propensity and a lower receiving propensity, compared to students.


References

Artico, I, Wit, E.C.. (2023). Dynamic latent space relational event model. Journal of the Royal Statistical Society Series A: Statistics in Society, 186 3508529.CrossRefGoogle Scholar
Back, M.D.. (2015). Opening the process black box: Mechanisms underlying the social consequences of personality. European Journal of Personality, 29, 9196.CrossRefGoogle Scholar
Bianconi, G, Darst, R-K, Iacovacci, J, Fortunato, S. (2014). Triadic closure as a basic generating mechanism of communities in complex networks. Physical Review E, 90, 042806.CrossRefGoogle ScholarPubMed
Borgatti, S.P., Halgin, D.S.. (2011). On network theory. Organization Science, 22 511681181.CrossRefGoogle Scholar
Box-Steffensmeier, J.M., Campbell, B.W., Christenson, D, Morgan, J. (2019). Substantive implications of unobserved heterogeneity: Testing the frailty approach to exponential random graph models. Social Networks, 59, 141153.CrossRefGoogle Scholar
Box-Steffensmeier, J.M., Christenson, D.P., Morgan, J.W.. (2018). Modeling unobserved heterogeneity in social networks with the frailty exponential random graph model. Political Analysis, 26 1319.CrossRefGoogle Scholar
Butts, C-T. (2008). A relational event framework for social action. Sociological Methodology, 38 1155200.CrossRefGoogle Scholar
Butts, C.T., Lomi, A, Snijders, T.A., Stadtfeld, C. (2023). Relational event models in network science. Network Science, 11 2175183.CrossRefGoogle Scholar
Corbo, L, Corrado, R, Ferriani, S. (2016). A new order of things: Network mechanisms of field evolution in the aftermath of an exogenous shock. Organization Studies, 37 3323348.CrossRefGoogle Scholar
DuBois, C., Butts, C., & Smyth P. (2013). Stochastic blockmodeling of relational event dynamics. In Artificial intelligence and statistics. PMLR (pp. 238246).Google Scholar
Fischbacher, U, Gächter, S, Fehr, E. (2001). Are people conditionally cooperative? evidence from a public goods experiment. Economics Letters, 71 3397404.CrossRefGoogle Scholar
Foster, D-V, Foster, J-G, Grassberger, P, Paczuski, M. (2011). Clustering drives assortativity and community structure in ensembles of networks. Physical Review E, 84, 066117.CrossRefGoogle ScholarPubMed
Gelman, A, Hill, JData analysis using regression and multilevel/hierarchical models 2006 CambridgeCambridge University Press.CrossRefGoogle Scholar
Geukes, K, Breil, S.M., Hutteman, R, Nestler, S, Küfner, A.C., Back, M.D.. (2019). Explaining the longitudinal interplay of personality and social relationships in the laboratory and in the field: The pils and the connect study. PloS one, 14 1.CrossRefGoogle ScholarPubMed
Hinde, R.A. (1979). Towards understanding relationships. London: Published in cooperation with European Association of Experimental Social Psychology by Academic Press.
Isen, A.M. (1987). Positive affect, cognitive processes, and social behavior. Advances in Experimental Social Psychology, 20, 203–253.
Juozaitienė, R., Wit, E.C. (2022). Non-parametric estimation of reciprocity and triadic effects in relational event networks. Social Networks, 68, 296–305.
Kevork, S., Kauermann, G. (2021). Iterative estimation of mixed exponential random graph models with nodal random effects. Network Science, 9(4), 478–498.
Klimek, P., Thurner, S. (2013). Triadic closure dynamics drives scaling-laws in social multiplex networks. New Journal of Physics, 15, 063008.
Klimt, B., & Yang, Y. (2004). The Enron corpus: A new dataset for email classification research. In European conference on machine learning (pp. 217–226).
Kumpula, J.M., Onnela, J.-P., Saramäki, J., Kaski, K., Kertész, J. (2007). Emergence of communities in weighted networks. Physical Review Letters, 99(22).
Lerner, J., Bussmann, M., Snijders, T.A., Brandes, U. (2013). Modeling frequency and type of interaction in event networks. Corvinus Journal of Sociology and Social Policy, 4(1), 3–32.
Leskovec, J., Backstrom, L., Kumar, R., & Tomkins, A. (2008). Microscopic evolution of social networks. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’08, New York, NY, USA (pp. 462–470). ACM.
Li, M., Zou, H., Guan, S., Gong, X., Li, K., Di, Z., Lai, C. (2013). A coevolving model based on preferential triadic closure for social media networks. Scientific Reports, 3, 2512.
Lusher, D., Koskinen, J., Robins, G. (2013). Exponential random graph models for social networks: Theory, methods, and applications. Cambridge: Cambridge University Press.
Madan, A., Cebrian, M., Moturu, S., Farrahi, K., et al. (2011). Sensing the “health state” of a community. IEEE Pervasive Computing, 11(4), 36–45.
McFarland, D. (2001). Student resistance: How the formal and informal organization of classrooms facilitate everyday forms of student defiance. American Journal of Sociology, 107, 612–678.
McPherson, M., Smith-Lovin, L., Cook, J.M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1), 415–444.
Michalski, R., Kajdanowicz, T., Bródka, P., Kazienko, P. (2014). Seed selection for spread of influence in social networks: Temporal vs. static approach. New Generation Computing, 32(3–4), 213–235.
Newman, M.E.J., Park, J. (2003). Why social networks are different from other types of networks. Physical Review E, 68, 036122.
Olk, P.M., Gibbons, D.E. (2010). Dynamics of friendship reciprocity among professional adults. Journal of Applied Social Psychology, 40(5), 1146–1171.
Perry, P., Wolfe, P. (2013). Point process modeling for directed interaction networks. Journal of the Royal Statistical Society, 75(5), 821–849.
Pfeiffer, T., Rutte, C., Killingback, T., Taborsky, M., Bonhoeffer, S. (2005). Evolution of cooperation by generalized reciprocity. Proceedings of the Royal Society B: Biological Sciences, 272(1568), 1115–1120.
Pilny, A., Schecter, A., Poole, M.S., Contractor, N. (2016). An illustration of the relational event model to analyze group interaction processes. Group Dynamics: Theory, Research, and Practice, 20(3), 181–195.
Raush, H.L. (1965). Interaction sequences. Journal of Personality and Social Psychology, 2(4), 487.
Robins, G., Pattison, P., Wang, P. (2009). Closure, connectivity and degree distributions: Exponential random graph (p*) models for directed social networks. Social Networks, 31(2), 105–117.
Rutte, C., Taborsky, M. (2007). Generalized reciprocity in rats. PLoS Biology, 5(7).
Sapiezynski, P., Stopczynski, A., Lassen, D.D., Lehmann, S. (2019). Interaction data from the Copenhagen networks study. Scientific Data, 6(1), 315.
Snijders, T., van de Bunt, G., Steglich, C. (2010). Introduction to stochastic actor-based models for network dynamics. Social Networks, 32, 44–60.
Snijders, T.A. (2017). Stochastic actor-oriented models for network dynamics. Annual Review of Statistics and Its Application, 4, 343–363.
Stadtfeld, C., Block, P. (2017). Interactions, actors, and time: Dynamic network actor models for relational events. Sociological Science, 4(14), 318–352.
Steele, F. (2003). A discrete-time multilevel mixture model for event history data with long-term survivors, with an application to an analysis of contraceptive sterilization in Bangladesh. Lifetime Data Analysis, 9(2), 155–174.
Therneau, T., Grambsch, P. (2000). Modeling survival data: Extending the Cox model. New York: Springer.
Thiemichen, S., Friel, N., Caimo, A., Kauermann, G. (2016). Bayesian exponential random graph models with nodal random effects. Social Networks, 46, 11–28.
Uzaheta, A., Amati, V., Stadtfeld, C. (2023). Random effects in dynamic network actor models. Network Science, 11(2), 249–266.
Vu, D., Lomi, A., Mascia, D., Pallotti, F. (2017). Relational event models for longitudinal network data with an application to interhospital patient transfers. Statistics in Medicine, 36(14), 2265–2287.
Yarmoshuk, A.N., Cole, D.C., Mwangu, M., Guantai, A.N., Zarowsky, C. (2020). Reciprocity in international interuniversity global health partnerships. Higher Education, 79(3), 395–414.
Table 1 Common structural network effects for a directed network.

Figure 1 Boxplots of the estimated standard deviations of (a) the expansiveness random effect (0.9) and (b) the popularity random effect (1.3) in four different model formulations.

Figure 2 True (dashed) and estimated (solid) baseline hazard curves. Different colours represent the results of four random effect models, each focusing on a different triadic closure effect. There is no obvious bias for any of the models.

Figure 3 (a) Typical $1/\sqrt{n}$ effect of network size on the estimate of the expansiveness standard deviation; (b) almost no effect of the number of relational events on the estimate of the popularity standard deviation.

Figure 4 (a) Phantom triadic effects are identified as a result of the presence of random popularity and expansiveness effects. (b) Degree- and intensity-based statistics are unable to fully account for nodal heterogeneity. In both plots, the solid curves depict the overestimated triadic closure effects, while the dashed line represents the actual level of triadic closure.

Figure 5 For each triadic closure effect, the boxplot represents the distribution of the parameter estimates obtained over 20 replications. In all cases, we observe positive receiver in-degree and sender out-degree effects.

Figure 6 QQ-plot comparing the p-values of the partial likelihood ratio test with the uniform distribution, suggesting that the likelihood ratio test based on the approximate integrated partial likelihood is slightly conservative.

Table 2 Estimated reciprocity and triadic effects for fixed and mixed effects models, showing that unmodelled nodal heterogeneity severely distorts these effects in the manufacturing company, classroom and phone calls datasets.

Table 3 Parameter estimates for the fixed effects and random effect standard deviations across all datasets (*p value < 0.05, p value < 0.1).