Introduction
Creativity is the capacity to use imagination and inventiveness to bring into existence original ideas, solutions, or products. Ideas and products are judged as creative to the extent that they provide both an original and valuable solution to the problem at hand (Srinivasan and Chakrabarti, 2010; Moldovan et al., 2011; Casakin and Georgiev, 2021), and the problem is heuristic rather than algorithmic (Amabile, 1983a, 1983b). If an idea is not new but already known, then we could just algorithmically copy the known idea without investing any creative effort. Alternatively, if a new solution to a posed problem is not useful, we would consider bringing it into existence a waste of available resources. Therefore, originality (novelty) and usefulness (value) of ideas interweave as parts of a single creativity construct (Stein, 1953; Wang, 2013; Taura, 2016; Lee et al., 2020).
The most creative ideas are both novel and useful, and the most creative people excel at both creativity dimensions. Descriptive psychological accounts of creativity (Guilford, 1957; Hudson, 1974; Runco, 2004, 2007; Runco and Pritzker, 2020) postulate that novelty is generated through divergent (associative, analogical) thinking that breaks assumptions and rules, while usefulness is enhanced through convergent (analytical) thinking that adheres to needs, boundaries, and constraints (Miron-Spektor and Erez, 2017). Both convergent and divergent paths are instructed in the design process (Goldschmidt, 2016, 2019). The process of innovative abduction is also related to both divergent and convergent thinking (Dong et al., 2016). Such descriptive accounts could be implemented in artificial intelligence machines only if translated into quantitatively precise procedures.
Cognitive processes and structures that underpin creative thinking and help produce creative acts and results are referred to as creative cognition (Finke et al., 1992). While the dynamic cognitive processes in creative contexts have been a focus of research (Wilkenfeld and Ward, 2001; Sonalkar et al., 2016), objective tools for evaluation of creative cognition have only recently been developed (Georgiev and Georgiev, 2018, 2019; Georgiev and Casakin, 2019; Han et al., 2020, 2021, 2022; Chiu et al., 2023). For example, dynamic semantic networks have been used to analyze the datasets from the 10th Design Thinking Research Symposium (DTRS 10) with design review conversations between design students and real clients recorded in educational settings (Adams and Siddiqui, 2013, 2015; Georgiev and Georgiev, 2018) and from the 12th Design Thinking Research Symposium (DTRS 12) with design discussions in a company context (Christiaans, 2018; Georgiev and Georgiev, 2023). Several semantic measures were found to be useful for quantitative evaluation of convergence/divergence in creative thinking and supported the role of divergent thinking for the success of generated design ideas (Georgiev and Georgiev, 2018, 2023). Procedures for real-time application of dynamic semantic networks in creative problem solving, however, are currently lacking. This necessitates the development of a detailed workflow for real-time application of dynamic semantic networks for monitoring and potential support of design creativity.
Main hypothesis and aims of the study
The main hypothesis of this work is that dynamic semantic networks could be used in real time to predict the success of creative design ideas. To test this hypothesis, first we aimed to develop a complete workflow for real-time application of dynamic semantic networks, using a moving time window for monitoring the cognitive processes during creative design. Second, we aimed to identify particular quantitative measures of information content and semantic similarity that highly correlate with human evaluation of word similarity in order to employ those measures in the constructed dynamic semantic networks for modeling creative cognition. Third, we aimed to evaluate the actual performance of the developed workflow through back-testing the dynamic semantic networks for predicting the success of design ideas using transcribed design review conversations from the DTRS 10 dataset.
Aim 1: Workflow for monitoring of creative cognition
The inner privacy of consciousness poses unique challenges to understanding cognitive processes (Georgiev, 2017, 2020a, 2020b). Experiences, feelings, emotions, thoughts, and beliefs constitute one’s consciousness, but the phenomenological qualia of these conscious states are not directly accessible by external observers (Nagel, 1974). Instead, the individual conscious states need to be externalized through expression into classical bits of Shannon information (Shannon, 1948) such as words, gestures, or images. Because verbalized reports of the individual stream of consciousness concurrent with the execution of a given cognitive task can be used reliably as data (Ericsson and Simon, 1980) for subsequent analysis with natural language processing scripts (Bird et al., 2009), we have developed a workflow for monitoring of creative cognition based on design conversations that occur during the development of a particular design product.
The overall structure of the workflow consists of two stages: natural language processing and semantic network processing (Figure 1). The first stage consists of five steps, all of which can be automated with the use of available Python libraries and scripts:
(1) audio recording of the design conversation,
(2) speech-to-text conversion,
(3) part-of-speech tagging for detection of nouns (concepts),
(4) construction of a moving time window, and
(5) removal of duplicates.
The second stage consists of two steps:
(6) construction of semantic networks based on WordNet, including computation of quantitative semantic measures from graphs, and
(7) dynamic statistical fit of trendlines for the selected semantic measures for detection of convergence or divergence in creative thinking.
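As a minimal sketch of steps (4) and (5), assuming that the nouns have already been extracted and time-stamped in steps (1) to (3), a moving time window with removal of duplicates could look as follows. The function name `moving_window_concepts`, the parameter `window_seconds`, and the sample stream are hypothetical and serve only for illustration.

```python
from collections import deque


def moving_window_concepts(timed_nouns, window_seconds):
    """Yield (time, unique nouns) for a window that ends at each new noun.

    timed_nouns: iterable of (timestamp_in_seconds, noun), in chronological order.
    """
    window = deque()
    for t, noun in timed_nouns:
        window.append((t, noun))
        # Step (4): drop nouns older than the window [t - window_seconds, t].
        while window[0][0] < t - window_seconds:
            window.popleft()
        # Step (5): remove duplicates, keeping first-occurrence order.
        yield t, list(dict.fromkeys(n for _, n in window))


# Hypothetical tagged nouns with timestamps in seconds.
stream = [(0, "chair"), (5, "leg"), (12, "chair"), (40, "fabric"), (65, "leg")]
snapshots = list(moving_window_concepts(stream, window_seconds=60))
```

Each snapshot holds the set of concepts from which a semantic network would be constructed in step (6).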
In this work, we consider that the engineering solutions for natural language processing needed to accomplish stage 1 of the workflow are readily available (Bird et al., 2009; Loria, 2016; Lee, 2024). Therefore, we will dedicate our efforts to providing detailed procedures for executing the semantic network processing using WordNet in stage 2 of the workflow. We will also elaborate on different graph theoretical alternatives for computation of information content, which is a measure of the surprisal due to the occurrence of a particular concept, and semantic similarity, which is a measure of how close the meanings of two concepts are. The dynamic trendlines for information content and semantic similarity obtained from recorded design conversations will then be backtested for correlation with the eventual success of developed design products using the DTRS 10 dataset, and we will propose potential application of dynamic semantic networks for real-time support or enhancement of design creativity.
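A dynamic trendline for a semantic measure can be illustrated with an ordinary least-squares line fitted to the values of the measure over time. This is only a sketch under the assumption of a linear trend; the function name `trendline` and the sample values are hypothetical, and the interpretation of a rising or falling slope depends on the chosen semantic measure.

```python
def trendline(times, values):
    """Ordinary least-squares fit: values ~ slope * time + intercept."""
    n = len(times)
    if n < 2:
        raise ValueError("at least two points are needed")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all time points are identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


# Hypothetical average semantic similarity in four consecutive windows.
times = [0, 1, 2, 3]
similarity = [0.2, 0.3, 0.4, 0.5]
slope, intercept = trendline(times, similarity)
```

Refitting the line each time the moving window advances gives a trend that is updated as the conversation unfolds.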
Aim 2: Modeling creative cognition with dynamic semantic networks
Replacement of subjective expert coding with objective computer algorithms
Creativity could be modeled computationally (Sosa and van Dijck, 2021). The theoretical foundations of automated problem solving rest on automated reasoning systems that use heuristic search techniques, such as the General Problem Solver (Newell et al., 1959). Early theories of problem-solving use computer simulations to predict human performance, explain the underlying processes and mechanisms, account for incidental phenomena, show how performance changes under different conditions, and explain how problem-solving skills are learned (Simon and Newell, 1971). The challenges of computational problem-solving in design revolve around encoding, representation, constraint analysis, and reduction of the effective solution space (Perkowski, 2022).
Protocol analysis of concurrent verbalization from professional design teams allows for the identification of reasoning patterns in idea generation (Cramer-Petersen et al., 2019), dynamic process patterns in design communication (Cash et al., 2020), success of design ideation (Maccioni and Borgianni, 2020) or evaluation of external sources of inspiration (Borgianni et al., 2020). Typically, the protocol analysis requires protocol coding performed by an expert, which invariably introduces a level of subjectivity that limits reproducibility by independent research teams. Furthermore, significant attention has been focused on the subjective nature of evaluating creativity of design ideas using metrics (Fiorineschi and Rotini, 2023). The subjective judgments required in metrics can significantly impact the evaluation of design creativity, owing to differences in how evaluators perceive and prioritize different aspects of the design ideas (Fiorineschi et al., 2022). Hence, further research is needed to address these challenges and develop more robust methods for evaluating design ideas (Borgianni et al., 2020). To dispense with the reliance on experts and ensure maximal reproducibility, here we describe an objective method for textual processing and construction of semantic networks that is fully automated by applied computer algorithms.
Semantic networks as graphs
Semantic networks represent knowledge in the form of graphs that consist of vertices and edges (Diestel, 2017). Vertices indicate individual concepts, whereas edges between pairs of vertices indicate specific semantic connections (Boden, 2004). Graphs could be divided into directed or undirected depending on the type of edges contained in the graph. The lack of loops, which are edges that connect vertices to themselves, and multiple edges with the same source and target vertices generates a simple directed graph. The lack of directed cycles generates a directed acyclic graph (DAG). The underlying undirected graph of a DAG, however, could contain cycles (Figure 2). Further introduction of a root vertex such that all edges of the directed graph are directed either away from or towards the root generates a rooted directed acyclic graph. Assigning different weights to the edges of directed or undirected graphs generates weighted graphs.
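The statement that a DAG may have an underlying undirected graph with cycles can be checked on a small toy example. The sketch below uses a hypothetical four-vertex "diamond" graph, Kahn's topological ordering to test for directed cycles, and a union-find structure to test for undirected cycles; it assumes a simple graph without multiple edges.

```python
def has_directed_cycle(edges):
    """Kahn's algorithm: the graph is acyclic iff every vertex can be removed."""
    vertices = {v for edge in edges for v in edge}
    indegree = {v: 0 for v in vertices}
    children = {v: [] for v in vertices}
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1
    queue = [v for v in vertices if indegree[v] == 0]
    removed = 0
    while queue:
        v = queue.pop()
        removed += 1
        for child in children[v]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return removed < len(vertices)


def has_undirected_cycle(edges):
    """Union-find: an edge joining two already connected vertices closes a cycle."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True
        parent[root_a] = root_b
    return False


# Toy rooted DAG: r -> a -> c and r -> b -> c.
diamond = [("r", "a"), ("r", "b"), ("a", "c"), ("b", "c")]
is_dag = not has_directed_cycle(diamond)
undirected_cycle = has_undirected_cycle(diamond)
```

Here the vertex c has two parents, so the directed graph is acyclic while its undirected counterpart contains the cycle r, a, c, b.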
WordNet is a lexical database that represents human knowledge in graph form (Miller et al., 1990; Miller, 1995, 1998; Fellbaum, 1998a, 1998b). This graph form is particularly suitable for imposing a distance function onto constructed semantic networks, which in turn enables quantitative exploration of dynamic cognitive processes such as creative cognition. Throughout this work, we employ WordNet 3.1, which is represented by a rooted directed acyclic graph of meanings (Figure 2). The constructed semantic networks are represented by undirected cyclic weighted graphs (Figure 3).
Semantic networks as models of the creative mind
Semantic networks constructed from conversation transcripts represent computational models of conceptual associations and structures (Georgiev et al., 2008, 2010; Han et al., 2020, 2021, 2022). Individual concepts in the semantic network need not be only single words but could also be phrases combining different parts of speech (Esparza et al., 2019), and can employ specific technical concepts (Sarica et al., 2021). However, for ensuring reproducibility and objectiveness, the extraction of concepts for the semantic networks should be based on a set of rules that can be automated by a computer program without the need of human intervention. This problem is addressed satisfactorily by identifying individual concepts with single words that could be further narrowed down to a single lexical category (nouns) with the use of part-of-speech tagging performed by natural language processing software (Georgiev and Georgiev, 2018).
Polysemy necessitates simultaneous use of words and meanings
Polysemy is essential for creative conceptual blending and associative thinking (Ravin and Leacock, 2000; Fauconnier and Turner, 2003; Nerlich et al., 2003; Georgiev and Taura, 2014). In particular, a designer’s mind could work simultaneously with several meanings of a word, similarly to how writers employ polysemy in humorous works (Boxman-Shabtai and Shifman, 2014). Meaning is deemed an essential component of creativity (Sääksjärvi and Gonçalves, 2018). To capture the possible impact of polysemous words in design thinking, we employ a computational method that does not compromise, reduce, or disambiguate between the multiple senses. The linguistic distinction between words and meanings could be approached in two different ways. Inclusion of a pre-processing step called disambiguation of senses could convert all words into senses. This would modify the verbal data through injection of interpretation even before the data analysis is started and would delete potentially useful information about underlying difficult-to-observe cognitive processes. For example, the nouns entering into the description of creative ideas may acquire different senses that disagree with those listed in a dictionary (Georgiev and Taura, 2014; Taura and Nagai, 2013). Furthermore, the polysemy of nouns was found to be instrumental in the association of ideas, which were previously not considered to be related by the creative problem solver (Georgiev and Taura, 2014).
To avoid the latter shortcomings, we construct semantic networks from verbal data without any disambiguation of senses of extracted words (nouns) (Georgiev and Georgiev, 2018), while preserving two types of vertices in the semantic network: word vertices and meaning vertices. The benefit is that discovered functional relationships in the semantic network could be correlated to the neural activity in specialized brain cortical areas such as Broca’s area, which translates meanings into words, and Wernicke’s area, which translates words into meanings (Georgiev et al., 2021). The utility of semantic networks of nouns for studying creativity and reconstruction of difficult-to-observe cognitive processes in conceptual design was demonstrated previously by different research teams with the use of several experimental datasets (Georgiev et al., 2008, 2010; Yamamoto et al., 2009; Taura et al., 2012).
Creative cognition modeled as a dynamic semantic network
To assess the dynamic aspects of cognitive processes, the constructed semantic network should be allowed to evolve in time (Figure 3). One inefficient approach is to consider the cumulative growth of the semantic network as new concepts appearing in the transcribed conversations are added to those concepts that have already appeared. This cumulative approach creates large networks fast even for relatively short conversations, which produces a large sample size for statistical analyses. The disadvantage is that it retains in the semantic network concepts that may have been briefly considered but then discarded by the cognitive processes underlying the given task. An alternative, much more efficient approach is to coarse-grain the time into time intervals, each of which will encompass a corresponding part of the conversation (Georgiev and Georgiev, 2018). The advantage of the noncumulative approach is that concepts are retained in the semantic network only if they repeatedly appear in the course of the problem solving conversation. The minimal time interval should be coarse-grained to contain a sufficient number of concepts to form a meaningful network. For real-time monitoring of verbal output, the semantic network dynamics could be tracked with a moving time window allowing for dynamic update of the semantic measures with each new word added in the conversation.
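The difference between the cumulative and the noncumulative (coarse-grained) approach can be sketched on a toy stream of time-stamped nouns. The function name `interval_networks`, the parameter `interval_seconds`, and the sample data are hypothetical; intervals that contain no nouns are simply skipped in this sketch.

```python
def interval_networks(timed_nouns, interval_seconds, cumulative=False):
    """Return the list of unique concepts for each consecutive time interval.

    cumulative=False: each interval keeps only its own concepts.
    cumulative=True: each interval also keeps all earlier concepts.
    """
    buckets = {}
    for t, noun in timed_nouns:
        buckets.setdefault(int(t // interval_seconds), []).append(noun)
    result, seen = [], []
    for index in sorted(buckets):
        if cumulative:
            seen.extend(buckets[index])
            concepts = seen
        else:
            concepts = buckets[index]
        # Remove duplicates while keeping first-occurrence order.
        result.append(list(dict.fromkeys(concepts)))
    return result


stream = [(0, "chair"), (20, "leg"), (70, "fabric"), (80, "chair"), (130, "color")]
noncumulative = interval_networks(stream, interval_seconds=60)
cumulative = interval_networks(stream, interval_seconds=60, cumulative=True)
```

In the noncumulative result the concept "leg" disappears after the first interval, whereas the cumulative result retains it until the end of the conversation.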
WordNet structure as a graph of meanings and words
Lexical categories in WordNet
WordNet 3.1 is a publicly available (http://wordnet.princeton.edu) lexical database for English created under the direction of G. A. Miller and hosted at Princeton University (Miller et al., 1990; Miller, 1995; Fellbaum, 1998a, 1998b). WordNet contains four subnets that correspond to four basic lexical categories: nouns, verbs, adjectives, and adverbs (Miller, 1998; Fellbaum, 1998b). Because there are only a few cross-subnet pointers (Fellbaum, 1998b), calculation of graph-theoretic distances between words could be achieved by confining the constructed semantic networks to a single subnet. The subnet of nouns provides the largest and deepest hierarchical taxonomy in WordNet, which can be efficiently used for the construction of semantic networks of nouns. In an experimental study using design review conversations, it was found that over 99.8% of all nouns used in the conversations are also listed in WordNet 3.1 (Georgiev and Georgiev, 2018). Further motivation for working with nouns in research of creative problem solving is provided by developmental linguistics findings that the category corresponding to nouns is, at its core, conceptually simpler or more basic than those corresponding to verbs and other parts of speech, which is exemplified by infants’ early advantage for learning nouns over verbs (Gentner, 1982; Waxman et al., 2013).
In addition, experimental creativity research has shown that networks of nouns stimulate the generation of creative ideas (Georgiev and Georgiev, 2019), different combinations of nouns and relations between nouns are associated with a display of creative thought (Dong, 2009), and dissimilarity of noun pairs produces a higher number of emergent features of creative ideas (Wilkenfeld and Ward, 2001).
Hypernym-hyponym hierarchy of nouns
The two basic semantic relations in WordNet are synonymy, where sets of word synonyms (synsets) form the basic building blocks of the lexical hierarchy, and hyponymy (subordination of synsets), where if a hyponym (subordinate) X is subsumed by a hypernym (superordinate) Y then it follows that “An X is a kind of Y” (Miller et al., 1990; Miller, 1998). The hypernym-hyponym (is-a) relationship provides a taxonomy of nouns in WordNet as follows: the root synset {entity} subsumes directly three different synsets, which can be viewed as classifying entities into {abstract_entity, abstraction}, {thing} or {physical_entity}. Each of these synsets then subsumes directly other synsets, further classifying classes of entities into subclasses, and so on. The hypernym-hyponym (is-a) hierarchy of nouns in WordNet is conceptually clear if it is represented in the form of a graph with two types of vertices: meaning vertices and word vertices.
The distinction between words and meanings would not have been required if there were one-to-one relationship between words and meanings. In natural language, however, one word can have several meanings (polysemy) and one meaning can be expressed by several words. For example, the word “horse” has five meanings thereby entering into five synsets as follows: M02377103 with synset {Equus_caballus, horse}, M03543217 with synset {gymnastic_horse, horse}, M03629976 with synset {horse, knight}, M04147696 with synset {buck, horse, sawbuck, sawhorse}, or M08414813 with synset {cavalry, horse, horse_cavalry}. Here, ‘M’ stands for meaning, the subsequent digits indicate the number by which the particular synset is referred to in WordNet. Explicitly labeling the meaning vertices with a numeric code further clarifies the significance of the synset and avoids possible misunderstanding of the synset as a list of words—in fact, the synset stands only for the meaning that is in common for all words in the list. For example, the meaning of M02377103 with synset {Equus_caballus, horse} is “the particular animal species in the genus Equus,” whereas the meaning of M03543217 with synset {gymnastic_horse, horse} is “an artistic gymnastics apparatus.” The fact that the meaning is the essential ingredient of a synset can be illustrated with a somewhat rare example of a meaning, which is difficult to guess from the words in the synset alone: the meaning of M03629976 with synset {horse, knight} is actually “a chess piece shaped to resemble the head of a horse” – this has to be understood from a sample sentence used in WordNet to clarify the meaning.
Composition of word and meaning subgraphs
The lexical hierarchy of nouns in WordNet 3.1 is comprised of 158,441 word vertices and 82,192 meaning vertices. Word vertices and meaning vertices are organized in two subgraphs: subgraph M, composed of 84,505 meaning → meaning edges between hypernyms and hyponyms, and subgraph W, composed of 189,555 word → meaning edges. The subgraph M is a rooted directed acyclic graph, which has as a root the meaning vertex M00001740 corresponding to the single-word synset {entity}. To compute graph theoretic measures for words, however, the subgraph M should be expanded with edges extracted from the subgraph W. For example, if we are interested in any semantic measure characterizing the word “horse,” we will need to extract the five edges from W that connect “horse” to each of its five meaning vertices. Only in the composite graph containing both meanings and words are we able to comprehend the content of a transcribed conversation.
The reason for appending extracted word edges to the graph M, instead of working in the full graph $ M\cup W $ , is computational efficiency: the subgraph W, which is effectively discarded, has twice as many edges as the subgraph M, and finding shortest paths is much faster in smaller graphs. Additional flexibility is achieved by the possible application of directed graph operators such as R(G), whose action on the graph G is to reverse the direction of all edges, or U(G), whose action on the graph G is to remove the directionality of all edges (Georgiev and Georgiev, 2018). As an example, W contains word → meaning edges, R(W) contains meaning → word edges, and U(W) contains meaning – word edges. In this way, it is possible to add words as subordinate vertices (subsumed by meanings) or as superordinate vertices (subsumers of meanings) depending on the semantic measures that need to be computed (Figure 4).
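The action of the operators R(G) and U(G), and the appending of extracted word edges to M, can be sketched with graphs stored as sets of edges. The vertex labels below (for example `M_equine`) form a hypothetical toy taxonomy and are not actual WordNet entries; the helper `J` mirrors the incident-edge function used in the text.

```python
def R(edges):
    """Reverse the direction of every edge."""
    return {(dst, src) for src, dst in edges}


def U(edges):
    """Remove directionality: each edge becomes an unordered pair."""
    return {frozenset(edge) for edge in edges}


def J(edges, vertices):
    """Edges that are incident to any of the given vertices."""
    vertices = set(vertices)
    return {edge for edge in edges if vertices & set(edge)}


# Toy subgraph M (meaning -> meaning, hypernym to hyponym).
M = {("M_entity", "M_animal"), ("M_entity", "M_artifact"),
     ("M_animal", "M_equine"), ("M_artifact", "M_chess_piece")}
# Toy subgraph W (word -> meaning).
W = {("horse", "M_equine"), ("horse", "M_chess_piece"),
     ("knight", "M_chess_piece")}

# "horse" appended as a subordinate vertex: M united with J[R(W), horse].
composite = M | J(R(W), {"horse"})
```

Only the two edges that touch "horse" are appended, so the composite graph has six edges and the rest of W is never loaded.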
Semantic measures based on WordNet
Graph-theoretic functions
Defining semantic measures for words from the graph structure of WordNet requires the introduction of several basic functions that take as arguments a graph, denoted with a capital letter, and one or more vertices, denoted with small letters (Georgiev and Georgiev, 2018, 2023). Hereafter, vertices of general type, either meanings or words, will be denoted as $ {v}_1,{v}_2,\dots, {v}_n $ , whereas word vertices will be specified as $ {w}_1,{w}_2,\dots, {w}_n $ .
Functions in a general graph
- $ \mathcal{J}\left(G,v\right) $ lists all edges that are incident to vertex v in the graph G.
- $ \mathcal{A}\left(G,v\right) $ lists all vertices that are adjacent to vertex v in the graph G.
Functions in a directed graph
- $ \mathcal{V}\left(G,v\right) $ lists all subvertices of vertex v in the graph G, where subvertices are all vertices with a finite directed path from v.
- $ \mathcal{S}\left(G,v\right) $ lists all subsumers of vertex v in the graph G, where subsumers are all vertices with a finite directed path to v.
- $ \mathcal{L}\left(G,v\right) $ lists the leaves of a vertex v in the graph G, where leaves are all subvertices of v with a vertex out-degree of zero.
- $ \mathcal{D}\left(G,{v}_1,{v}_2\right) $ gives the shortest path distance measured in edges from vertex $ {v}_1 $ to vertex $ {v}_2 $ in the graph G, where the output is infinite ∞ if there is no path from $ {v}_1 $ to $ {v}_2 $ .
Functions in a rooted directed graph
$ \mathcal{T}\left(G,v\right) $ gives the depth in the taxonomy of a vertex v measured as the number of vertices on the shortest path from the root vertex r to v in the graph G.
Set theoretic functions
$ \left|f(x)\right| $ counts the number of elements in the list $ f(x) $ .
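The functions listed above can be sketched with a breadth-first search over a graph stored as a set of directed edges. This is a minimal illustration on a hypothetical toy graph; it assumes that a vertex counts as its own subvertex and subsumer (a path of zero length), which is an assumption of this sketch.

```python
import math
from collections import deque


def _bfs(edges, start):
    """Edge-count distances from start along directed edges (start is at 0)."""
    dist, queue = {start: 0}, deque([start])
    while queue:
        v = queue.popleft()
        for src, dst in edges:
            if src == v and dst not in dist:
                dist[dst] = dist[v] + 1
                queue.append(dst)
    return dist


def subvertices(G, v):
    """V(G, v): vertices with a directed path from v (v included by assumption)."""
    return set(_bfs(G, v))


def subsumers(G, v):
    """S(G, v): vertices with a directed path to v (v included by assumption)."""
    return set(_bfs({(dst, src) for src, dst in G}, v))


def leaves(G, v):
    """L(G, v): subvertices of v that have out-degree zero."""
    sources = {src for src, _ in G}
    return {u for u in subvertices(G, v) if u not in sources}


def distance(G, v1, v2):
    """D(G, v1, v2): shortest directed path in edges, infinite if none exists."""
    return _bfs(G, v1).get(v2, math.inf)


def depth(G, root, v):
    """T(G, v): number of vertices on the shortest path from the root to v."""
    return distance(G, root, v) + 1


# Toy rooted DAG with root r.
G = {("r", "a"), ("r", "b"), ("a", "c"), ("b", "c"), ("c", "d")}
```

For example, `len(subsumers(G, "c"))` plays the role of the counting function applied to the list of subsumers of c.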
Semantic measures for single words
The level of abstraction, polysemy and information content are semantic measures applicable to single words. For a semantic network composed of n word vertices, the average for each of these semantic measures could be determined with n searches in WordNet.
Level of abstraction
The level of abstraction of a word is the complement to unity of the word concreteness. For nouns, both measures are related to the noun depth in the WordNet taxonomy such that nouns located higher in the hierarchy are more abstract and less concrete, whereas nouns located lower in the hierarchy are less abstract and more concrete (Meng et al., 2013; Georgiev and Georgiev, 2018). In graph theoretic notation, the level of abstraction of the word w is
where $ {\mathcal{T}}_{\mathrm{max}}=19 $ is the maximal depth of WordNet 3.1 taxonomy and $ \mathcal{T}(w) $ is the depth of the word w computed as the shortest path distance between the root meaning vertex M00001740 with synset {entity} and the word vertex w in the graph $ M\cup \mathcal{J}\left[R(W),w\right] $ .
Polysemy
The polysemy measures the number of different meanings possessed by a given word. About 44% of English words are polysemous, which means that they have more than one meaning (Britton, 1978). The log transformed value of polysemy quantifies the missing bits of information required for correct understanding of the intended meaning of a given word. That missing information is usually extracted from the context of the conversation. For monosemous words, which have only one meaning, there is no ambiguity and no information is missing. In graph theoretic notation, the polysemy of the word w is $ \left|\mathcal{A}\left(W,w\right)\right| $ , which counts the number of all meaning vertices that are adjacent to the word vertex w.
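Counting the meaning vertices adjacent to a word, and converting the count into missing bits of information, can be sketched as follows. The word → meaning edges are the five synsets of "horse" quoted earlier in the text plus one edge for "Equus_caballus"; all other edges of W are omitted, and the function names are hypothetical.

```python
import math


def polysemy(word_edges, w):
    """Number of distinct meaning vertices adjacent to the word vertex w."""
    return len({meaning for word, meaning in word_edges if word == w})


def missing_bits(word_edges, w):
    """Log-transformed polysemy: bits needed to single out the intended meaning."""
    return math.log2(polysemy(word_edges, w))


# Excerpt of W restricted to the edges mentioned in the text.
W_excerpt = {
    ("horse", "M02377103"), ("horse", "M03543217"), ("horse", "M03629976"),
    ("horse", "M04147696"), ("horse", "M08414813"),
    ("Equus_caballus", "M02377103"),
}

horse_polysemy = polysemy(W_excerpt, "horse")
horse_bits = missing_bits(W_excerpt, "horse")
```

A word with a single meaning yields zero missing bits, in line with the remark on monosemous words.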
Information content
The intrinsic information content of words or meanings is measured in bits solely from the graph-theoretic structure of WordNet 3.1. For comparison of different formulas that measure the information content of words in WordNet 3.1, however, it is helpful to work with normalized values in the unit interval [0,1]. To write compactly the formulas for information content, we will need the following word functions.
For a given word w: $ \mathcal{S}(w) $ lists the meaning subsumers in the graph $ M\cup \mathcal{J}\left[R(W),w\right] $ ; $ \mathcal{V}(w) $ lists the meaning subvertices in the graph $ M\cup \mathcal{J}\left(W,w\right) $ ; $ \mathcal{H}(w) $ lists the meaning hyponyms in the set difference $ \mathcal{V}(w)\backslash \left[\mathcal{A}\left(W,w\right)\cup w\right] $ ; $ \mathcal{L}(w) $ lists the leaves in the graph $ M\cup \mathcal{J}\left(W,w\right) $ ; and $ \mathcal{C}(w) $ computes the commonness $ {\sum}_{i\in \mathcal{L}\left(G,w\right)}\frac{1}{\left|\mathcal{S}\left(M,i\right)\right|} $ in the graph $ G=M\cup \mathcal{J}\left(W,w\right) $ .
Several constants specific for WordNet 3.1 are also useful: $ {\mathcal{V}}_{\mathrm{max}}=82192 $ is the maximal number of meaning subvertices, $ {\mathcal{L}}_{\mathrm{max}}=65031 $ is the maximal number of leaves, $ {\mathcal{C}}_{\mathrm{min}}=1/35 $ is the minimal commonness, and $ {\mathcal{C}}_{\mathrm{max}}=6863.6 $ is the maximal commonness.
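The commonness of a word can be illustrated on a toy graph as the sum, over the leaves below the word, of the reciprocal number of subsumers of each leaf in M. The sketch assumes that a leaf counts among its own subsumers, which is an assumption made here for illustration; the graph and the function names are hypothetical.

```python
from collections import deque
from fractions import Fraction


def reachable(edges, start):
    """All vertices reachable from start along directed edges (start included)."""
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for src, dst in edges:
            if src == v and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


def commonness(M, W, w):
    """Sum of 1/|S(M, i)| over the leaves i below the word w."""
    G = M | {(src, dst) for src, dst in W if src == w}  # M united with J(W, w)
    sources = {src for src, _ in G}
    leaf_set = {v for v in reachable(G, w) if v not in sources}
    M_reversed = {(dst, src) for src, dst in M}
    # Subsumers of a leaf are found by walking upward in M.
    return sum(Fraction(1, len(reachable(M_reversed, leaf))) for leaf in leaf_set)


# Toy taxonomy: the leaf c has two parents, the leaf d has one.
M = {("r", "a"), ("r", "b"), ("a", "c"), ("b", "c"), ("a", "d")}
W = {("w", "a")}
c_w = commonness(M, W, "w")
```

Here the leaf c has four subsumers and the leaf d has three, so the commonness of w equals 1/4 + 1/3.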
Below, we summarize seven information content formulas whose performance for the analysis of creativity in design review conversations has been tested previously (Georgiev and Georgiev, 2018).
Information content by Blanchard et al. (2008):
Information content by Meng et al. (2012):
Information content by Sánchez et al. (2011):
Information content by Sánchez and Batet (2012):
Information content by Seco et al. (2004):
Information content by Yuan et al. (2013):
Information content by Zhou et al. (2008a):
Semantic similarity for word pairs
For a semantic network composed of n word vertices, the average semantic similarity for all pairs of vertices could be determined with (n 2 − n)/2 searches in WordNet. Different definitions of semantic similarity between a pair of distinct words w 1 and w 2 rely on the lowest common subsumer of w 1 and w 2, the fraction of common subsumers, or the shortest path distance between w 1 and w 2 (Georgiev and Georgiev, Reference Georgiev and Georgiev2018).
The lowest common subsumer $ \mathcal{K}\left({w}_1,{w}_2\right) $ of a pair of distinct words $ {w}_1 $ and $ {w}_2 $ in the graph $ G=M\cup \mathcal{J}\left[R(W),\left\{{w}_1,{w}_2\right\}\right] $ is the deepest meaning vertex in the taxonomy among all vertices i whose sum $ \mathcal{D}\left[G,i,{w}_1\right]+\mathcal{D}\left[G,i,{w}_2\right] $ is minimal. If several common subsumers of w 1 and w 2 are at the same depth in the WordNet 3.1 taxonomy, the meaning vertex with lowest entry number is considered to be the unique $ \mathcal{K}\left({w}_1,{w}_2\right) $ . The depth $ \mathcal{T}\left[\mathcal{K}\left({w}_1,{w}_2\right)\right] $ of the lowest common subsumer $ \mathcal{K}\left({w}_1,{w}_2\right) $ is determined solely within the subgraph M.
The shortest path distance $ \mathcal{D}\left({w}_1,{w}_2\right) $ between a pair of distinct words $ {w}_1 $ and $ {w}_2 $ in the graph $ U(M)\cup \mathcal{J}\left[U(W),\left\{{w}_1,{w}_2\right\}\right] $ is the number of edges on the shortest path minus the two edges that lie outside the subgraph $ U(M) $.
Path-based similarity measures
Semantic similarity by Al-Mubaid and Nguyen (Reference Al-Mubaid and Nguyen2006):
Semantic similarity by Leacock and Chodorow (Reference Leacock, Chodorow and Fellbaum1998):
Semantic similarity by Li et al. (Reference Li, Bandar and McLean2003):
Semantic similarity by Rada et al. (Reference Rada, Mili, Bicknell and Blettner1989):
Semantic similarity by Wu and Palmer (Reference Wu and Palmer1994):
Subsumer-based similarity measures
Subsumer-based similarity measures reduce to path-based ones only for monosemous words, whereas for polysemous words they produce distinct results.
Semantic similarity by Jaccard (Reference Jaccard1912):
Semantic similarity by Braun-Blanquet (Reference Braun-Blanquet1932):
Semantic similarity by Dice (Reference Dice1945):
Semantic similarity by Otsuka (Reference Otsuka1936) and Ochiai (Reference Ochiai1957):
Semantic similarity by Kulczyński (Reference Kulczyński1927):
Semantic similarity by Simpson (Reference Simpson1960):
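As a hedged illustration of how these six measures differ, the sketch below evaluates the textbook set-overlap coefficients that carry the cited authors' names on two sets of subsumers. Applying them directly to plain Python sets of subsumer identifiers is an assumption made here for illustration, the function name `overlap_coefficients` is hypothetical, and the exact formulas used in this work may be written in a different notation.

```python
from math import sqrt


def overlap_coefficients(subsumers_1, subsumers_2):
    """Textbook set-overlap coefficients for two non-empty sets of subsumers."""
    a, b = set(subsumers_1), set(subsumers_2)
    if not a or not b:
        raise ValueError("both subsumer sets must be non-empty")
    common = len(a & b)
    return {
        "jaccard": common / len(a | b),
        "braun_blanquet": common / max(len(a), len(b)),
        "dice": 2 * common / (len(a) + len(b)),
        "otsuka_ochiai": common / sqrt(len(a) * len(b)),
        "kulczynski": 0.5 * (common / len(a) + common / len(b)),
        "simpson": common / min(len(a), len(b)),
    }


# Hypothetical subsumer identifiers for two words.
scores = overlap_coefficients({1, 2, 3, 4}, {3, 4, 5})
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

All six coefficients share the same numerator, the number of common subsumers, and differ only in how they normalize it by the sizes of the two sets.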
Information content-based similarity measures
The following five information content-based similarity formulas could take as an input any of the seven information content formulas to give a total of 35 different information content-based similarity measures whose performance for the analysis of creativity in design review conversations has been tested previously (Georgiev and Georgiev, Reference Georgiev and Georgiev2018).
Semantic similarity by Jiang and Conrath (Reference Jiang and Conrath1997):
Semantic similarity by Lin (Reference Lin1998):
Semantic similarity by Meng et al. (Reference Meng, Huang and Gu2014):
Semantic similarity by Resnik (Reference Resnik1995):
Semantic similarity by Zhou et al. (Reference Zhou, Wang and Gu2008b) is a weighted average of the path-based Leacock–Chodorow similarity with weight $ k $ and the information content-based Jiang–Conrath similarity with weight $ 1-k $. Usually, the two weights are set to be equal, $ k=1-k=\frac{1}{2} $.
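To make the role of the information content input concrete, the following sketch implements the widely used textbook form of the Lin similarity, together with a generic weighted average of a path-based and an information content-based similarity. The function names, the convention of returning zero when both words carry zero information content, and the numerical inputs are assumptions for illustration; the normalization used in this work may differ.

```python
def lin_similarity(ic_1, ic_2, ic_lcs):
    """Textbook Lin similarity: 2 * IC(lcs) / (IC(w1) + IC(w2))."""
    total = ic_1 + ic_2
    if total == 0:
        # Assumed convention: two words with zero information content
        # are treated as having zero similarity.
        return 0.0
    return 2 * ic_lcs / total


def weighted_similarity(path_sim, ic_sim, k=0.5):
    """Weighted average with weight k on the path-based term."""
    if not 0 <= k <= 1:
        raise ValueError("k must lie in the unit interval")
    return k * path_sim + (1 - k) * ic_sim


# Hypothetical information content values for two words and their
# lowest common subsumer.
print(lin_similarity(0.8, 0.6, 0.35))
print(weighted_similarity(0.4, 0.8))
```

Any of the seven information content formulas could supply the three inputs of `lin_similarity`, which is how a single similarity formula gives rise to seven distinct measures.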
Correlation between subjective and objective evaluation of word similarity
Cluster analysis based on RG-65 dataset
Many semantic similarity measures based on the hypernym–hyponym hierarchy in WordNet were developed by their authors using the RG-65 dataset as a testbed. This dataset contains 65 noun–noun pairs whose semantic similarity was evaluated from subjective reports collected from human subjects (Rubenstein and Goodenough, Reference Rubenstein and Goodenough1965). Consequently, we have also used the RG-65 dataset to assess the degree of correlation between subjective human evaluation of word similarity and all semantic similarity measures reported in this work. Pearson correlation analysis shows that subsumer-based similarity measures are least correlated with subjective human evaluation (average r = 0.67, P < 0.001), followed by path-based similarity measures (average r = 0.82, P < 0.001), and information content-based similarity measures (average r = 0.84, P < 0.001). Subsequent hierarchical clustering segregates subsumer-based similarity measures into a single small cluster that is least correlated with human evaluation, but mixes path-based and information content-based similarity measures into another large cluster (Figure 5). Although any of the available path-based or information content-based similarity measures could be used for exploration of divergent or convergent thinking in conversational transcripts, a previous experimental study of creative cognition by design students in a real-world educational setting has found that information content-based similarity measures exhibit the highest statistical power to differentiate between successful and unsuccessful ideas (Georgiev and Georgiev, Reference Georgiev and Georgiev2018). Here, we have identified the information content formula by Sánchez–Batet (6) as the one that exhibits the highest correlation with human evaluation of word similarity, with an average r = 0.85 across all five information content-based semantic similarity formulas.
Furthermore, the semantic similarity formula by Lin (22) has the highest r = 0.85 when used with the Sánchez–Batet formula (6) among all purely information content-based semantic similarity formulas, which are computationally fast to execute in real-time applications. Thus, the combination of formulas (6) and (22) gives the best correlation with human evaluation of similarity and optimizes computational speed for engineering applications.
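The correlation analysis described above can be sketched as follows, assuming that each similarity measure yields one score per noun pair and that the human ratings are listed in the same order. The function name `pearson_r` and all numbers are made up for illustration and are not values from the RG-65 dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up human ratings and made-up scores of one similarity measure
# for the same five noun pairs (illustration only).
human_ratings = [3.9, 3.1, 2.4, 1.0, 0.2]
measure_scores = [0.95, 0.70, 0.66, 0.30, 0.05]
print(round(pearson_r(human_ratings, measure_scores), 3))
```

Repeating this computation for every measure gives one correlation value per measure, which can then be averaged within each family of measures or used as input for clustering.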
Aim 3: Back-testing dynamic semantic measures for success of design ideas
Construction of semantic networks from design conversations
Concurrent verbalization may not access all cognitive processes involved in design thinking. Nevertheless, language provides an output channel of information that could be used to monitor the ongoing design process and an input channel of information that could be used by the designer to incorporate external feedback on the designed product. Thus, it is desirable to assess whether verbalization and language could be useful in aiding the design process, even though they do not exhaust everything that goes on in the designer’s mind. Next, we illustrate with a concrete empirical example how design review conversations could be analyzed using a moving time window and demonstrate the utility of dynamic semantic measures to differentiate between successful and unsuccessful ideas.
Distinguishing successful ideas from unsuccessful ideas
For numerical analysis, we have employed the experimental dataset of complete transcripts with design review conversations provided as a part of the 10th design thinking research symposium (DTRS 10) including two subsets with students majoring in industrial design: a subset with seven junior students and a subset with five graduate students (Adams and Siddiqui, Reference Adams and Siddiqui2013, Reference Adams and Siddiqui2015). Each design project consisted of five stages: (1) task review, (2) concept review, (3) client review, (4) concept reduction review, and (5) final presentation. For each project, the students developed several possible design solutions (Figure 6), from which only the best one was selected to appear in the final presentation.
This final design solution, which was selected after consultation with the client, is considered to be the successful idea because it has won the competition with other design solutions (ideas) that were not successful in regard to appearing in the final presentation. Here, our main motivation is to distinguish the best design solution from all the rest. Segmentation of the transcripts with regard to successful and unsuccessful ideas was performed based on the videos and the presentation slides. Overall, there were 12 successful ideas, and 41 unsuccessful ideas in the DTRS 10 dataset (Georgiev and Georgiev, Reference Georgiev and Georgiev2018).
Construction of moving time window
To test for a possible relationship between attributes of divergent/convergent thinking and the success of design ideas, we have split the design conversation transcripts for each idea and have constructed dynamic semantic networks with a moving time window that contains six distinct nouns. This approach is different from bag-of-words, because it does not keep the multiplicity of nouns. The first time window is constructed by removing repeated nouns in the conversation until six distinct nouns have been collected. Average (mean) information content of the six nouns in each time window was computed using the Sánchez–Batet formula (6), whereas average (mean) semantic similarity of the 15 noun pairs was computed with the Lin formula (22). These particular formulas have also been found to differentiate well between successful and unsuccessful ideas when the design conversations are split into three equal parts (Georgiev and Georgiev, Reference Georgiev and Georgiev2018).
The temporal duration for the development of each idea was normalized within the unit interval, so that time = 0 indicates the start and time = 1 indicates the end of the design review conversation that pertains to the idea under consideration. The shortest time step corresponds to the appearance of the next noun in the conversation. If the next noun is already repeated in the preceding time window, the time step is added but there is no dynamic change of the semantic network. Alternatively, if the next noun is not repeated in the preceding time window, then both the time step is added and there is a dynamic change of the semantic network.
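The construction of the moving time window can be sketched in Python as follows. Two details are assumptions made for this illustration and are not stated in the text: the oldest noun leaves the window when a new noun enters, and normalized time runs from 0 at the first complete window to 1 at the last noun. The function name `moving_windows` and the example nouns are hypothetical.

```python
def moving_windows(nouns, size=6):
    """Return (normalized_time, window) pairs, one per noun once the
    first window of `size` distinct nouns is complete."""
    window = []
    frames = []
    for noun in nouns:
        if noun not in window:
            # A noun absent from the window changes the semantic network.
            window.append(noun)
            if len(window) > size:
                window.pop(0)  # assumption: the oldest noun is dropped
        # A repeated noun adds a time step but leaves the window unchanged.
        if len(window) == size:
            frames.append(tuple(window))
    steps = len(frames)
    return [
        (i / (steps - 1) if steps > 1 else 0.0, frame)
        for i, frame in enumerate(frames)
    ]


# Hypothetical stream of nouns: "a" repeats, "g" is new.
for time, frame in moving_windows(["a", "b", "c", "d", "e", "f", "a", "g"]):
    print(f"{time:.2f}", frame)
```

In this example the repeated noun produces a time step with an unchanged window, whereas the new noun produces a time step with a changed window, matching the two cases described above.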
Because different students had generated different numbers of unsuccessful ideas, to ensure equal weight of each project we have first averaged the dynamic trajectories per student and only then have we averaged the trajectories of unsuccessful ideas across students. Although the resulting average trajectories of semantic measures appear to be noisy, it is possible to extract smooth trendlines using a linear best fit that minimizes the sum of squared residuals (Figure 7).
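The two-stage averaging and the least-squares trendline can be sketched as follows. The sketch assumes that all trajectories have already been resampled onto a common grid of normalized time, which is an assumption of this illustration; the function names and the numbers are hypothetical.

```python
def mean_trajectory(trajectories):
    """Pointwise mean of trajectories sampled on the same time grid."""
    return [sum(values) / len(values) for values in zip(*trajectories)]


def average_across_students(per_student):
    """Average within each student first, then across students,
    so that every student carries equal weight."""
    student_means = [mean_trajectory(t) for t in per_student.values()]
    return mean_trajectory(student_means)


def linear_fit(times, values):
    """Slope and intercept minimizing the sum of squared residuals."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


# Hypothetical trajectories: student s1 has two ideas, student s2 has one.
grid = [0.0, 0.5, 1.0]
per_student = {"s1": [[0, 1, 2], [2, 3, 4]], "s2": [[0, 0, 0]]}
average = average_across_students(per_student)
slope, intercept = linear_fit(grid, average)
print(average, slope, intercept)
```

Averaging within each student before averaging across students prevents a student with many unsuccessful ideas from dominating the average trajectory.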
Statistical differences in the rates of change of semantic measures
To test whether the linear trendlines constructed with a moving time window are able to extract faithfully the rates of change (slopes of the trendlines) for information content and semantic similarity reported in a previous study where the design review conversations were divided into three equal parts (Georgiev and Georgiev, Reference Georgiev and Georgiev2018), we have employed one-tailed paired t-tests. The statistical analysis indeed confirmed that the information content exhibited a positive rate of change ($ {k}_s=0.024 $) for successful ideas, whereas it exhibited a negative rate of change ($ {k}_u=-0.012 $) for unsuccessful ideas (t = 2.24, P = 0.024). The opposite dynamics was observed for semantic similarity, which exhibited a negative rate of change ($ {k}_s=-0.08 $) for successful ideas and a positive rate of change ($ {k}_u=0.01 $) for unsuccessful ideas (t = –2.05, P = 0.032). Computationally, divergent thinking is identified by a negative rate of change of semantic similarity in time, whereas convergent thinking is identified by a positive rate of change of semantic similarity in time. Thus, consistent with psychological theories linking creativity with divergent thinking (Guilford, Reference Guilford1957; Hudson, Reference Hudson1974; Runco, Reference Runco2004, Reference Runco2007; Runco and Pritzker, Reference Runco and Pritzker2020), we have found that successful ideas exhibit computational attributes of divergent thinking such as increasing information content (Figure 7a) and decreasing semantic similarity (Figure 7b) during the development of ideas in time, whereas unsuccessful ideas exhibit computational attributes of convergent thinking such as decreasing information content (Figure 7a) and increasing semantic similarity (Figure 7b).
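For readers who wish to reproduce this kind of comparison, the paired t statistic can be computed as in the sketch below. The sketch assumes that the slopes for successful and unsuccessful ideas are paired per student, which is an assumption of this illustration; the function name and the slope values are made up and are not the data of this study.

```python
from math import sqrt


def paired_t_statistic(first, second):
    """Return the paired t statistic and its degrees of freedom."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased sample variance of the paired differences.
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / sqrt(variance / n), n - 1


# Made-up slopes for four hypothetical students (illustration only).
slopes_successful = [0.03, 0.01, 0.04, 0.02]
slopes_unsuccessful = [-0.01, 0.00, -0.02, 0.01]
t_value, dof = paired_t_statistic(slopes_successful, slopes_unsuccessful)
print(round(t_value, 3), dof)
```

The one-tailed P value then follows from the Student t distribution with the returned degrees of freedom, which is typically evaluated with a statistics library.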
These results were obtained retrospectively through analysis of human-curated transcripts, which were free of machine errors in speech-to-text conversion and had undergone post-processing of nouns such as conversion of plural to singular and omission of nouns that are absent from WordNet 3.1. However, the numerical plots establish as a proof of principle that conversation analysis could be employed in real time and that the trendlines for semantic measures could be provided to the designer as a forecast of whether the design product is going to be successful.
Discussion
WordNet captures faithfully the distinction between words and meanings
Progress on difficult scientific problems usually requires the development and adoption of new research tools for investigation (Laudan, Reference Laudan1978; Marx, Reference Marx2013; Glocker et al., Reference Glocker, Musolesi, Richens and Uhler2021). Here, our goal was to lay the foundations of an objective methodology for approaching the problem of human creativity. While the idea of using WordNet’s hypernymy for the investigation of analogical concepts is not new (Geum and Park, Reference Geum and Park2016), here we have scrutinized the possible implementation of dynamic semantic networks based on WordNet 3.1 as a tool for the exploration of creative thinking through analysis of verbal data obtained concurrently with the act of problem solving. The graph theoretic representation of WordNet 3.1 as a composition of two directed subgraphs, respectively for words and meanings, is computationally powerful and neuroscientifically well-tailored to capture the two anatomically distinct language-related brain cortical areas specialized for functional processing of words and meanings (Georgiev and Georgiev, Reference Georgiev and Georgiev2018; Georgiev et al., Reference Georgiev, Georgieva, Gong, Nanjappan and Georgiev2021). This is to be contrasted with the prevalent natural language processing approaches whose main goal is to analyze the meanings extracted from the verbal data, while viewing the words only as labels for the intended meanings that need to be disambiguated by a special preprocessing step of the transcribed texts. By keeping both words and meanings, the presented approach captures more faithfully the complexity of human thinking and provides an inroad to virtually imaged concepts that were not verbalized but provide links in the WordNet 3.1 intrinsic hierarchy (Yamamoto et al., Reference Yamamoto, Goka, Yusof, Taura, Nagai, Bergendahl, Grimheden, Leifer, Skogstad and Lindemann2009; Georgiev et al., Reference Georgiev, Nagai and Taura2010).
Advantages of dynamic semantic networks
Having thoroughly discussed the theoretical and applied aspects of dynamic semantic networks, we could summarize their main advantages as follows:
(1) reproducible objective analysis of verbal data,
(2) extraction of both verbalized and virtually imaged concepts in creative problem solving,
(3) minimal confounding injection of interpretation at the stage of data analysis due to dual use of words and meanings,
(4) minimal disturbance of spontaneous creative cognition due to reliance on verbalization of naturally occurring inner monologue,
(5) explicit acknowledgment of educational status and language proficiency of test subjects, and
(6) possibility for real-time computer-assisted audio-visual feedback of the constructed dynamic semantic network for enhancement of human creativity.
Limitations of dynamic semantic networks
The main limitations of dynamic semantic networks include:
(1) lower performance with certain types of creativity that rely on a sensory inner stream of consciousness composed of visual images or sounds instead of words (e.g. painting of art or composing of music),
(2) possibly inadequate testing of individuals with low educational status, low language proficiency, or neurological deficits leading to aphasia, and
(3) lack of direct access to neural processes that remain outside of the contents of individual conscious experiences (Georgiev, Reference Georgiev2017, Reference Georgiev2020a, Reference Georgiev2020b).
These limitations effectively define the domain of applicability of semantic networks. For supporting creative cognition, inference methodologies based on semantic distance can be employed (Sarica et al., Reference Sarica, Song, Luo and Wood2021). For studying creative cognition outside of the domain of individual conscious experiences, dynamic semantic networks could be complemented with mind-reading technologies that rely on reconstruction of mental images (e.g. visual images) from recorded electrical brain activity (Horikawa and Kamitani, Reference Horikawa and Kamitani2017; Roelfsema et al., Reference Roelfsema, Denys and Klink2018).
Outlook for future work
The range of creative activities that could benefit from dynamic analysis with semantic networks is quite extensive and includes much of problem solving in science, technology, engineering, and mathematics. The economically most important forms of creativity, related to the design and innovation of cutting-edge products, equipment, or services, are performed by highly trained, well-educated professionals with excellent language proficiency. Therefore, their creative performance is amenable to verbalization, modeling, and improvement with dynamic semantic networks, which substantiates the need for wider future adoption of semantic networks in theoretical and applied cognitive science. This can be connected with an interdisciplinary approach to design thinking. Computer systems endowed with general artificial intelligence may dynamically monitor semantic measures of verbalized inner monologue, e.g. if the designer thinks aloud, and provide real-time feedback on creative problem solving. Human creativity could then be enhanced through suggestions that lead to divergence of semantic similarity of the developed solutions.
Conclusions
This work sought to develop a complete workflow for real-time application of dynamic semantic networks for monitoring cognitive processes during creative design, using specific measures on these networks that correlate with human evaluation. To achieve that objective, we back-tested the actual performance of the developed workflow by evaluating ideas generated in design review conversations from an established dataset. This testing involved construction of semantic networks from design conversations, distinguishing successful ideas from unsuccessful ones, and construction of a moving time window. The results demonstrate statistical differences in the rate of change of semantic measures for successful ideas and unsuccessful ideas. Overall, successful ideas exhibit computational attributes pertaining to divergent thinking, while unsuccessful ideas exhibit attributes of convergent thinking. This is seen as a proof of principle that dynamic analysis of conversations can be employed in real time as a forecast of the success of ideas and design products. This workflow allows for objective analysis of verbal data in design while preserving the spontaneity of creative cognition. This opens up the possibility of real-time AI tools that analyze and enhance human creativity as it occurs during the design process.
Competing interest
None declared.
Data availability statement
WordNet 3.1 is available online at: https://wordnet.princeton.edu/. The RG-65 dataset (Rubenstein and Goodenough, Reference Rubenstein and Goodenough1965) for human judgements of word similarity is available online at: https://doi.org/10.1145/365628.365657. The authors have signed Data-Use Agreements to Dr. Robin Adams (Purdue University) for accessing the Purdue DTRS Design Review Conversations Database, thereby agreeing not to reveal personal identifiers and not to create any commercial products.