1. Introduction
Design ideation is a creative stage in the design process where designers gather with open minds to produce various solution concepts to address design problems. The aim of this process is for designers to explore the design space extensively, that is, the space of all possible design concepts (Daly et al. Reference Daly, Seifert, Yilmaz and Gonzalez2016). The effectiveness of this process relies significantly on the knowledge, experience and creativity of human designers. However, these same factors can also limit designers' ability to thoroughly explore the design space. Various methods have been developed to aid designers in generating concepts during the ideation phase (Osborn Reference Osborn1957; Daly et al. Reference Daly, Christian, Yilmaz, Seifert and Gonzalez2011; Álvarez & Ritchey Reference Álvarez and Ritchey2015).
As large language models (LLMs) such as ChatGPT have evolved, several researchers have explored their potential to enhance the creativity of human designers (Filippi Reference Filippi2023; Girotra et al. Reference Girotra, Meincke, Terwiesch and Ulrich2023; Wang et al. Reference Wang, Zuo, Cai, Yin, Childs, Sun and Chen2023; Zhu & Luo Reference Zhu and Luo2023). These studies primarily focus on how LLMs can improve the efficiency of generating design concepts, while their impact on the diversity of generated concepts remains underexplored. We have previously conducted preliminary research on how ChatGPT can be prompted with a professional persona to increase the diversity of generated design concepts (Feng, Hélie & Panchal Reference Feng, Hélie and Panchal2024). Building on that work, this study addresses the following research question:
How can LLMs be prompted to maximize the diversity of design concepts generated during the design ideation process?
To answer this question, we formulate a central hypothesis and sub-hypotheses:
Central hypothesis:
For a design concept generation problem, an LLM provides more diverse design concepts when it is prompted with multiple professional personas using a sequential prompting strategy compared to (1) a parallel prompting strategy and (2) a collective prompting strategy.
Sub-hypotheses:
1a. Design concepts generated by LLMs operating with a virtual professional persona cover professional concepts that are semantically more aligned with concepts from the knowledge base of this persona.
1b. Topics in a professional knowledge base are semantically more aligned with topics within the same knowledge base, as compared to topics from other knowledge bases.
2. For a set of design concepts generated by an LLM prompted with a given professional persona, a more diverse set of design concepts can be generated by prompting the LLM with a new professional persona to update these concepts.
3a. LLMs generate more diverse concepts when prompted with multiple personas using the parallel prompting strategy, compared to providing prompts without specifying any persona.
3b. Given multiple professional personas, LLMs generate more diverse design concepts when these personas are provided using the sequential prompting strategy, compared to the collective prompting strategy.
3c. Given multiple professional personas, LLMs generate more diverse design concepts when these personas are provided using the sequential prompting strategy, compared to the parallel prompting strategy.
The remainder of the paper is organized as follows. First, a concise background of existing studies on design ideation, LLMs and design representations is presented. Second, the methods applied in this study are introduced, including (1) knowledge base construction, (2) design problem and professional persona selection and (3) prompting technique and strategy design. Third, each sub-hypothesis defined above is tested across seven design ideation problems using five different professional personas from various fields, and the results are discussed. Finally, the paper concludes by discussing the findings and their implications for engineering design.
2. Background
This section provides an overview of the study’s relevant topics and reviews the existing literature. The discussion covers existing design ideation techniques, LLMs and their application to design ideation problems and design representations.
2.1. Design ideation
2.1.1. Traditional design ideation techniques and limitations
Design ideation consists of generating, developing and communicating ideas where an “idea” refers to a fundamental unit of thought that can take visual, tangible or abstract forms (Jonson Reference Jonson2005). According to Tschimmel (Reference Tschimmel2012), there are usually two stages in a design activity: divergence and convergence. Designers can benefit from the diversity of ideas generated in the divergence stage to converge into more innovative design concepts. Different methods have been developed to support designers in generating solutions, such as brainstorming (Osborn Reference Osborn1957), morphological analysis (Álvarez & Ritchey Reference Álvarez and Ritchey2015) and design heuristics (Daly et al. Reference Daly, Christian, Yilmaz, Seifert and Gonzalez2011).
Previous work has shown that the effectiveness of these methods depends on the creativity of human designers and can be restricted by a range of factors. For example, cognitive biases affect designers’ behavior in information interpretation, decision-making and reasoning (Hewstone, Rubin & Willis Reference Hewstone, Rubin and Willis2003; Boysen & Vogel Reference Boysen and Vogel2009; Plews-Ogan et al. Reference Plews-Ogan, Bell, Townsend, Canterbury and Wilkes2020). As a result, designers tend to favor certain design ideas while dismissing others (Agyemang, Andreae & McComb Reference Agyemang, Andreae and McComb2023). Groupthink, described by Janis (Reference Janis1982) as “a group where loyalty requires each member to avoid raising controversial issues,” is another factor that can restrict human creativity in the design ideation process. It is the combined result of unconscious preferences for minimizing cognitive effort and a subconscious inclination to avoid social resistance, which can lead to low-quality decisions and hinder the group’s ability to explore a broader range of alternatives (Fox Reference Fox2019; Akhmad, Chang & Deguchi Reference Akhmad, Chang and Deguchi2021).
2.1.2. Multidisciplinary collaboration in design ideation
Given the cognitive and social constraints in traditional ideation methods, multidisciplinary collaboration has emerged as a valuable approach to overcoming such limitations and enriching the design process. According to Adams et al. (Reference Adams, Chen, Davis, McKenna, McDonnell and Lloyd2009), multidisciplinary practice refers to the process of designers “joining together of disciplines to work on common problems and splitting apart when work is done.” Kleinsmann et al. (Reference Kleinsmann, Deken, Dong and Lauche2012) also noted that collaboration requires crossing knowledge boundaries, emphasizing synthesis of practices and shared understanding.
In order to effectively boost multidisciplinary collaboration in design, several methods and strategies have been developed to support the communication and integration of diverse knowledge throughout this process. Van Helden et al. (Reference Van Helden, Zandbergen, Specht and Gill2023) found that clearly defined rules for collaboration, division of labor and shared objectives significantly enhance collaborative effectiveness in student design teams. Several studies have also focused on how linguistic and cultural differences in communication among multidisciplinary design teams can influence team performance during collaborative design tasks (Jutraz & Zupancic Reference Jutraz and Zupancic2014; D’Souza & Dastmalchi Reference D’Souza, Dastmalchi, Bohemia, Sung and Lemon2017).
In general, design teams made up of members from different disciplines tend to generate more diverse and creative design solutions, as designers bring different perspectives to the problem-solving process (Agogué et al. Reference Agogué, Poirel, Pineau, Houdé and Cassotti2014; Gero, Yu & Wells Reference Gero, Yu and Wells2019).
2.2. Large language models (LLMs)
LLMs, such as BERT (Devlin et al. Reference Devlin, Chang, Lee and Toutanova2019), Llama (Touvron et al. Reference Touvron, Lavril, Izacard, Martinet, Lachaux, Lacroix, Rozière, Goyal, Hambro, Azhar, Rodriguez, Joulin, Grave and Lample2023), ChatGPT (Radford et al. Reference Radford, Narasimhan, Salimans and Sutskever2018) and Gemini (Anil et al. Reference Anil, Borgeaud and Wu2023), have been increasingly popular in recent years. These models produce language that closely mimics writing created by humans since they are trained on extensive textual datasets, which enables them to answer questions and carry out a wide range of linguistic tasks (Kasneci et al. Reference Kasneci, Sessler, Küchemann and Bannert2023). Higher performance in these tasks can be attained by applying fine-tuning approaches on smaller task-specific datasets (Wang et al. Reference Wang, Zuo, Cai, Yin, Childs, Sun and Chen2023). The state-of-the-art models outperform previous artificial intelligence (AI) models in general intelligence, effectively taking on novel and complex tasks such as machine vision, creative writing, coding and problem-solving in mathematics (Bubeck et al. Reference Bubeck, Chandrasekaran, Eldan, Gehrke, Horvitz, Kamar, Lee, Lee, Li, Lundberg, Nori, Palangi, Ribeiro and Zhang2023).
Among LLMs, the Generative Pre-trained Transformer (GPT) has been iteratively improved from GPT-1 to GPT-5 (current at the time of this writing). Based on the transformer architecture, a GPT is trained using unsupervised learning methods on large text datasets. ChatGPT is a chatbot based on the GPT architecture and has become a potent tool with a wide range of applications across multiple areas (Liebrenz et al. Reference Liebrenz, Schleifer, Buadze, Bhugra and Smith2023; McGee Reference McGee2023; Thorp Reference Thorp2023; Wu et al. Reference Wu, Yin, Qi and Wang2023).
2.2.1. LLM-assisted design
To tackle the limitations of human designer-based design techniques introduced in Section 2.1, various studies tested the feasibility of using LLMs in design tasks. For example, Zhu & Luo (Reference Zhu and Luo2023) show that these models with certain customizations could generate verbal design concepts with a reasonable level of competence, while Wang et al. (Reference Wang, Zuo, Cai, Yin, Childs, Sun and Chen2023) propose a prompt-guided method inspired by the function–behavior–structure model to progressively generate design ideas.
With the rapid development of LLMs, recent research has focused on integrating these models into various aspects of the design process. First, multiple studies have tested the capacity of LLMs to handle information and knowledge, from extracting and comprehending technical information from engineering documentation (Doris et al. Reference Doris, Grandi, Tomich, Alam, Ataei, Cheong and Ahmed2024), to utilizing external knowledge to inform the design process (Han & Moghaddam Reference Han and Moghaddam2024; Li, Ko & Ameri Reference Li, Ko and Ameri2025). Second, multiple studies have focused on how LLMs can influence cognitive processes and education in design thinking (Agarwal, Jablokow & McComb Reference Agarwal, Jablokow and McComb2024; Jiang, Huang & Shen Reference Jiang, Huang and Shen2024; Zhang, Zhao & Haddad Reference Zhang, Zhao and Haddad2024). Generally, these studies demonstrate that prompting LLMs with different cognitive styles can influence the feasibility and innovativeness of generated solutions, thereby enhancing design fluency, particularly for novice designers. Finally, LLMs have also been utilized to automate or support key aspects of the detailed design process, such as material selection (Grandi et al. Reference Grandi, Jain, Groom, Cramer and McComb2025), requirements elicitation (Ataei et al. Reference Ataei, Cheong, Grandi, Wang, Morris and Tessier2024), computer-aided design (CAD) object generation (Li, Sun & Sha Reference Li, Sun and Sha2025), engineering requirements tasks (Norheim et al. Reference Norheim, Rebentisch, Xiao and de Weck2024) and additive manufacturing (Xie et al. Reference Xie, Hoskins, Rowe and Ju2025).
Researchers have also explored the performance of LLMs in the concept generation and ideation process. An early study by Ma et al. (Reference Ma, Grandi, McComb and Goucher-Lambert2023) assessed the proficiency of LLMs in generating conceptual designs and contrasted the resulting design ideas with crowd-sourced design concepts. The authors found that while LLMs can generate more feasible and valuable design ideas, these ideas are less novel and diverse. Their subsequent study tested the impact of parameter tuning and prompt variations on the diversity of design concepts generated by LLMs (Ma et al. Reference Ma, Grandi, McComb and Goucher-Lambert2025). This study showed that there is no clear relationship between the parameters and concept diversity, and that concept diversity is highly responsive to the structure of the prompts provided. Chen et al. (Reference Chen, Xia, Jiang, Tan, Sun and Zhang2024) integrated LLMs with concept–knowledge (C-K) theory to support interdisciplinary knowledge retrieval and structured concept generation, demonstrating improved innovation and feasibility in design outcomes. Naghavi, Wang & Xu (Reference Naghavi, Wang and Xu2024) combined deep generative models with LLMs to reconstruct and generate porous metamaterial units, showing the potential of LLMs for advanced design representation and ideation. Given these results, introducing the concept of multidisciplinary collaboration into the LLM-assisted design process could be beneficial for generating diverse and innovative design concepts.
2.2.2. Prompt engineering
For a language model, the prompt is the user-provided input that guides the model’s output. The study of prompt engineering aims to create the best possible prompt inputs so that LLMs can produce the necessary (accurate or novel) responses. To effectively execute the specific tasks discussed in this study, it is essential to design prompts for LLMs that consider existing prompt engineering techniques.
Amatriain (Reference Amatriain2024) provided several pointers and strategies for the prompt design process, such as “encouraging the model to be factual,” “explicitly ending the prompt instructions,” and “being forceful.” Based on these instructions, more sophisticated methods for creating a prompt or set of prompts have been tried and documented. Wang, Moazeni & Klabjan (Reference Wang, Moazeni and Klabjan2025) proposed a Bayesian optimal learning framework for automated prompt engineering in LLMs, using a knowledge–gradient policy to efficiently discover high-performing prompts under limited evaluation budgets. Cheng et al. (Reference Cheng, Dai, Liu, Yu, Lin, Liang, Tan, Narang, Chung and Le2024) proposed a model-adaptive prompt optimization (MAPO) approach that adapts different LLMs using model-friendly prompts to enhance their capabilities across various downstream tasks. Yuksekgonul et al. (Reference Yuksekgonul, Bianchi, Boen, Liu, Huang, Guestrin and Zou2024) introduced the TEXTGRAD framework, which enables automatic optimization of components in AI systems by backpropagating natural language feedback from LLMs based on a user-defined objective.
Specifically, various studies have examined how including personas in the prompts provided to LLMs can impact their performance in general tasks. These studies have led to different conclusions. Hu & Collier (Reference Hu and Collier2024) and Zheng et al. (Reference Zheng, Pei, Logeswaran, Lee and Jurgens2024) demonstrated that incorporating personas in prompts has minimal impact on the performance of LLMs in natural language processing (NLP)-related tasks. However, Luz de Araujo & Roth (Reference Luz de Araujo and Roth2025) found that LLMs prompted with personas show greater variability in their responses to both objective and subjective tasks. It is worth noting that these studies have primarily focused on the impact of LLMs prompted by personas in NLP and question-answering tasks, while their impact on design-related problems remains underexplored.
2.3. Design representations
Representing design concepts is crucial for the design process. In addition to providing a strategy for efficient and systematic analysis and comparison of various design concepts, it also assists in determining the efficacy of different ideation processes (Cash & Maier Reference Cash and Maier2021). Over the years, various frameworks and methodologies have been developed to represent design concepts, which are utilized in different domains for diverse objectives. First, a design can be represented as a semantic network based on design descriptions in textual form (Sarica, Han & Luo Reference Sarica, Han and Luo2023). This representation provides designers with insights into complex information in design concepts by utilizing an existing knowledge base to extract entities and relations from a textual design description. Furthermore, parametric design is an approach that defines a design concept using parameters that represent different attributes of the design concept, and the simple modification or optimization of each design is achieved through parameter changes (Monedero Reference Monedero2000). Parametric design is widely applied in areas such as CAD. Lastly, feature-based modeling attempts to illustrate how changes to each input feature impact the design as a whole by representing designs according to their features (Bhatt, Weller & Moura Reference Bhatt, Weller and Moura2020). This method works especially well for engineering, production and assembly activities.
3. Methods
The approach adopted in this study aims to test how an LLM can be prompted to increase the diversity of design concepts it generates. In this approach, an LLM is first prompted with multiple professional personas to generate a set of concepts for a specific design problem. Different strategies for the prompting process are tested in this study, which are discussed later in this section. In each strategy, the same LLM is also prompted to summarize the generated design concepts into multiple professional terms. These terms are embedded in the same semantic space as the topics from the professional knowledge bases to test their alignment with existing professional topics, and the diversity of a set of terms is also analyzed to assess the diversity of the design concepts generated.
The specific steps include constructing professional knowledge bases, selecting design problems and professional personas, and designing prompting techniques and strategies. These steps are discussed in further detail in the following sections.
3.1. Knowledge base construction
In this paper, the knowledge base associated with a domain refers to the set of primary topics with which a typical expert within that domain would be familiar. For example, a trained mechanical engineer would be familiar with topics such as solid mechanics, fluid mechanics, machine design and heat transfer. In contrast, a trained computer scientist would be more familiar with topics such as AI, algorithm structures and human–computer interactions. These higher-level topics can be further decomposed into more specific topics such as the first law of thermodynamics, entropy and thermodynamic cycles for the mechanical engineering (ME) knowledge base, or NLP, search algorithms and user experience design for the computer science (CS) knowledge base.
In this study, knowledge bases were built for the following five disciplines: ME, chemical engineering (CHE), CS, biology (BIOL) and psychology (PSY), which correspond to the five professional personas selected for the prompting process. For each knowledge base, the following two-stage procedure was adopted to collect the topics: (1) core course collection and (2) topic extraction and preprocessing.
In the first step, the key courses for each field of study were determined by reviewing the curriculum of the associated undergraduate program. The courses selected were directly relevant to the specific topics within each major, so foundational courses such as calculus and physics were excluded. The selected courses ensured that a wide range of professional concepts was represented within each domain. In total, 45 courses were gathered across the five domains.
Subsequently, a standard textbook was acquired for each core course, and the table of contents from each textbook was used for topic extraction. Notably, the themes presented in the table of contents were chosen because they were more concise than other sources for extracting course topics, such as the course description or learning outcomes. Next, the topics for each course were extracted from the table of contents. The raw text data for the topics were cleaned up after this process. Headers such as “Introduction,” “Summary,” “Exercises” and “Overview,” which have general or ambiguous definitions, were excluded. Because the text embedding API from OpenAI (Radford et al. Reference Radford, Kim, Xu, Brockman, McLeavey and Sutskever2023) used in this study is case-sensitive, all topics were transformed to lowercase, except for proper nouns and acronyms such as “SI units” and “DNA.” The creation of the knowledge base for this study was completed after these data-cleaning procedures. This knowledge base, divided into five separate domain-specific knowledge bases, comprises a total of 4469 topics. Table 1 provides a summary of the knowledge base.
Table 1. Summary of the knowledge base across five professional domains

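For illustration, the following minimal Python sketch mirrors this cleaning step. The header and acronym lists shown here are abbreviated examples, not the full lists used in the study.

```python
# Minimal sketch of the topic-cleaning step described above.
# GENERIC_HEADERS and PRESERVE are abbreviated, illustrative lists.
GENERIC_HEADERS = {"introduction", "summary", "exercises", "overview"}
PRESERVE = {"SI", "DNA"}  # acronyms/proper nouns kept in their original case

def clean_topics(raw_topics):
    """Drop generic headers and lowercase topics, preserving listed acronyms."""
    cleaned = []
    for topic in raw_topics:
        topic = topic.strip()
        if topic.lower() in GENERIC_HEADERS:
            continue
        words = [w if w in PRESERVE else w.lower() for w in topic.split()]
        cleaned.append(" ".join(words))
    return cleaned

print(clean_topics(["Introduction", "The First Law of Thermodynamics", "SI Units"]))
# -> ['the first law of thermodynamics', 'SI units']
```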
3.2. Design problems and selection of professional personas
A set of criteria was used to select the design problems tested in this study from the design research literature. To encourage design ideas from designers with diverse professional backgrounds, the problem should first encompass a range of professional fields. Second, the problem statement should not contain any particular design restrictions that would prevent the generation of creative design concepts. Third, every design problem needs to be summarized in a single sentence so that it fits into the prompt provided to an LLM using a predetermined format.
The design ideation problems tested in this study were modified from the study by Goucher-Lambert & Cagan (Reference Goucher-Lambert and Cagan2019) based on the criteria introduced above. These problems originated in earlier design literature and were all modified according to the same specifications for our investigation. An overview of the design ideation problems employed in this paper is provided in Table 2. As indicated in the previous section, five professional personas, namely mechanical engineer, chemical engineer, computer scientist, biologist and psychologist, were selected. These professional personas were chosen for two reasons: (a) they spanned a wide range of domains of study, from engineering to science and social science, and (b) they were in alignment with some undergraduate majors that are commonly offered in colleges.
Table 2. Design problems used in this study

3.3. Prompting techniques
3.3.1. Prompting with a single professional persona for design concept generation
This section introduces how LLMs were prompted with a professional persona to generate design concepts for a design problem. In this process, the design problem is provided to an LLM with a professional persona specified using the following prompt:
Prompt 1:
“I would like you to generate {num_solutions} different designs or solutions for {design_problem} using the knowledge base of a {profession}.
For each design concept, please provide a long paragraph of detailed technical description about how this design can be built and the specific professional concepts included in this design.”
In this prompt, “{num_solutions}” specifies the number of design concepts the LLM is expected to generate, and “{design_problem}” is replaced by the problem string listed in Table 2 to specify the design problem. The “{profession}” part is replaced with one of the following strings to specify the professional persona the LLM is asked to adopt when generating design concepts: “mechanical engineer,” “chemical engineer,” “computer scientist,” “biologist” and “psychologist.”
With this prompt, a set of design concepts for the specified design problem is generated by the LLM with the specified professional persona.
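For concreteness, the snippet below sketches how Prompt 1 can be issued programmatically. The OpenAI Python client is used for illustration only; the model name, decoding settings and error handling are assumptions that are not specified in the prompt itself.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT_1 = (
    "I would like you to generate {num_solutions} different designs or solutions "
    "for {design_problem} using the knowledge base of a {profession}.\n"
    "For each design concept, please provide a long paragraph of detailed technical "
    "description about how this design can be built and the specific professional "
    "concepts included in this design."
)

def generate_concepts(design_problem, profession, num_solutions=5, model="gpt-4o"):
    """Fill Prompt 1 with the problem and persona, and return the LLM response."""
    prompt = PROMPT_1.format(
        num_solutions=num_solutions,
        design_problem=design_problem,
        profession=profession,
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

concepts = generate_concepts("an innovative product to froth milk", "mechanical engineer")
```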
3.3.2. Prompting with multiple professional personas for design concept generation
The following prompt was provided to an LLM to generate design concepts based on multiple professional personas through a single prompt:
Prompt 2:
“I would like you to generate {num_solutions} different designs or solutions for {design_problem} using the combined knowledge base of {professions}.
For each design concept, please provide a long paragraph of detailed technical description about how this design can be built and the specific professional concepts included in this design.”
This prompt is similar to the prompt defined for a single professional persona, with the only difference being that the “{professions}” part in the sentence above is replaced with the combined strings of the professional personas. The personas are separated using commas, such as “mechanical engineer, chemical engineer, computer scientist.” Using this prompt, a set of design concepts for the given design problem is generated by the LLM with the combination of specified personas.
3.3.3. Prompting with professional persona for design concept update
In the design ideation process, a new set of design concepts can be generated by updating an existing set of design concepts using the knowledge base of a new professional persona. This process is achieved by providing the following prompt to an LLM:
Prompt 3:
“Listed below are some design concepts for {design_problem}.
{design_solutions}
I would like you to generate {num_solutions} new designs or solutions for the same design problem by using the knowledge base of a {profession} to integrate or complete each of the design concepts generated above.
For each new design concept, please provide a long paragraph of detailed technical description about how this design can be built and the specific professional concepts included in this design.”
In this prompt, “{design_problem},” “{num_solutions}” and “{profession}” share the same meaning as in the first prompt. Also, the “{design_solutions}” part is replaced by the string of design concepts to be updated in this step. Using the prompt, a new set of design concepts is generated by applying the expertise of a new persona to update an existing set of design concepts.
3.3.4. Prompting for design concept summarization
The LLM-generated design concepts were summarized into a set of professional terms, and the alignments between these terms and topics from professional knowledge bases were evaluated to assess how professional knowledge in different fields is utilized in the LLM-generated design concepts. This process is achieved by providing the following prompt to the same LLM used for design concept generation:
Prompt 4:
“Listed below are {num_solutions} design concepts for {design_problem}:
{design_solutions}
I would like you to summarize each design concept into {num_terms} professional terms strictly in the format of the following example:
Name: design concept.
Professional Terms: term 1; term 2; term 3; …”
In this prompt, the “{num_solutions}” part specifies the number of design concepts to be summarized with this prompt; the “{design_problem}” part specifies the original design problem these design concepts correspond to; the “{design_solutions}” part is replaced by the string of design concepts to be summarized with this prompt; and the “{num_terms}” part specifies the number of professional terms each design concept is summarized into. With this prompt, the LLM is also instructed to formulate its output in the defined format, so that the terms it provides can be easily extracted later.
With this prompt, the design concepts are summarized into a set of professional terms that represent the professional knowledge involved.
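Because Prompt 4 constrains the output format, the professional terms can be recovered with a simple parser such as the sketch below. It assumes the LLM follows the requested format exactly, so additional error handling may be needed in practice; the sample response is illustrative.

```python
import re

def parse_professional_terms(summary_text):
    """Extract term lists from a response following the Prompt 4 format:
    'Name: ...' followed by 'Professional Terms: term 1; term 2; ...'."""
    term_lists = []
    for line in summary_text.splitlines():
        match = re.match(r"\s*Professional Terms:\s*(.+)", line)
        if match:
            terms = [t.strip() for t in match.group(1).split(";") if t.strip()]
            term_lists.append(terms)
    return term_lists

sample = (
    "Name: AI-Assisted Frother.\n"
    "Professional Terms: machine learning; fluid dynamics; sensor calibration\n"
)
print(parse_professional_terms(sample))
# -> [['machine learning', 'fluid dynamics', 'sensor calibration']]
```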
3.4. Prompting strategies
This study adopts three different strategies to prompt LLMs with multiple professional personas for design concept generation: parallel prompting, collective prompting and sequential prompting. These correspond to the strategies defined in Hypothesis 3, and each strategy is implemented by applying the prompting techniques described above.
3.4.1. Parallel prompting
Parallel prompting refers to independently providing multiple prompts, each with a professional persona, to an LLM to generate design concepts for a design problem. As shown in Figure 1, the design problem is first presented independently to the LLM with different personas using Prompt 1 defined above, and multiple sets of design concepts are generated corresponding to the provided personas in this step. With this strategy, an LLM generates multiple sets of design concepts for a design ideation problem, each corresponding to a professional persona.

Figure 1. Parallel prompting strategy.
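In code, the parallel strategy amounts to independent calls of the Prompt 1 helper sketched in Section 3.3.1, one call per persona; the helper name generate_concepts is carried over from that sketch.

```python
PERSONAS = ["mechanical engineer", "chemical engineer", "computer scientist",
            "biologist", "psychologist"]

def parallel_prompting(design_problem, personas=PERSONAS, num_solutions=5):
    """Prompt the LLM independently with each persona (Prompt 1) and return
    one set of design concepts per persona."""
    return {persona: generate_concepts(design_problem, persona, num_solutions)
            for persona in personas}
```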
3.4.2. Collective prompting
In collective prompting, an LLM is provided with a single prompt that contains multiple professional personas for generating design concepts. This process begins with Prompt 2, as defined above, using the specified combination of personas. Only one set of design concepts is generated at this step. With this strategy, an LLM generates a single set of design concepts for a design ideation problem by combining the knowledge bases of different personas.
3.4.3. Sequential prompting
In sequential prompting, the LLM is provided with a sequence of prompts, each with a professional persona, to generate design concepts by iteratively updating existing design concepts. As shown in Figure 2, the LLM is first instructed to generate a set of design concepts using Prompt 1 with the first persona in the sequence. This set of design concepts is then provided to the same LLM with Prompt 3, where the LLM is prompted to generate a new set of design concepts for the same problem by updating the provided concepts using the knowledge base of the second persona in the sequence. Subsequently, this update process is repeated using each of the remaining personas in the sequence, and a new set of design concepts is generated in each step by updating the concepts from the previous step using the knowledge base of the persona specified in the current step. For a sequence of n professional personas, n sets of design concepts are generated in this process.

Figure 2. Sequential prompting strategy.
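The sequential strategy can be sketched as the loop below, in which the concepts from the previous step are fed back through the update prompt (Prompt 3). The client object and the generate_concepts helper are carried over from the earlier sketches, and the template string is a programmatic rendering of Prompt 3.

```python
PROMPT_3 = (
    "Listed below are some design concepts for {design_problem}.\n"
    "{design_solutions}\n"
    "I would like you to generate {num_solutions} new designs or solutions for the "
    "same design problem by using the knowledge base of a {profession} to integrate "
    "or complete each of the design concepts generated above.\n"
    "For each new design concept, please provide a long paragraph of detailed "
    "technical description about how this design can be built and the specific "
    "professional concepts included in this design."
)

def sequential_prompting(design_problem, persona_sequence, num_solutions=5, model="gpt-4o"):
    """Generate concepts with the first persona (Prompt 1), then repeatedly
    update the previous concepts with each remaining persona (Prompt 3)."""
    concept_sets = [generate_concepts(design_problem, persona_sequence[0],
                                      num_solutions, model)]
    for persona in persona_sequence[1:]:
        prompt = PROMPT_3.format(
            design_problem=design_problem,
            design_solutions=concept_sets[-1],
            num_solutions=num_solutions,
            profession=persona,
        )
        response = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": prompt}]
        )
        concept_sets.append(response.choices[0].message.content)
    return concept_sets  # one set of design concepts per step in the sequence
```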
4. Experiments with design problems for the testing of hypotheses
This section presents experiments with selected design problems aimed at testing the hypotheses defined earlier in this article. Each hypothesis was tested using three different LLMs: ChatGPT (GPT-4o), Gemini-1.5 flash and Llama-3.1-8B-Instruct.
4.1. Evaluation of design concepts
Each design concept was represented by a set of related professional terms. The assessment of design concepts relied on the semantic analysis of associated terms, where semantic text analysis was achieved by numerically encoding each piece of textual data as a vector within an embedding space (Mikolov et al. Reference Mikolov, Chen, Corrado and Dean2013). Textual data were encoded into vectors using OpenAI’s “text-embedding-3-large” model, which maps text strings into a 3072-dimensional space. These semantic representations enabled the evaluation of similarity between two phrase groups and the analysis of the diversity of phrases within each group.
The semantic similarity of two phrases can be quantified by computing the cosine similarity of their vector representations in the embedding space (Rahutomo, Kitasuka & Aritsugi Reference Rahutomo, Kitasuka and Aritsugi2012). Table 3 provides an illustrative example of the cosine similarities for some phrase pairs in the embedding space of OpenAI’s “text-embedding-3-large” model. On this basis, the similarity between two groups of phrases can be measured by calculating the average pairwise cosine similarity between phrases in the first group and phrases in the second group.
Table 3. Cosine similarities for sample phrase pairs

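The similarity computations described above can be sketched as follows. The embedding model matches the one named in the text; the batching of embedding requests and the helper names are illustrative assumptions.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts, model="text-embedding-3-large"):
    """Embed a list of strings into 3072-dimensional vectors."""
    response = client.embeddings.create(model=model, input=list(texts))
    return np.array([item.embedding for item in response.data])

def cosine_similarity(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def group_similarity(phrases_a, phrases_b):
    """Average pairwise cosine similarity between two groups of phrases."""
    emb_a, emb_b = embed(phrases_a), embed(phrases_b)
    return float(np.mean([cosine_similarity(a, b) for a in emb_a for b in emb_b]))

def intragroup_similarity(phrases):
    """Average pairwise cosine similarity within a single group of phrases."""
    emb = embed(phrases)
    sims = [cosine_similarity(emb[i], emb[j])
            for i in range(len(emb)) for j in range(i + 1, len(emb))]
    return float(np.mean(sims))
```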
The diversity of phrases within a group is calculated using the approach introduced by Ahmed & Fuge (Reference Ahmed and Fuge2018). This method involves constructing a symmetric similarity matrix $ L $ for a collection of phrases $ S $. Each element $ {L}_{i,j} $ in the matrix quantifies the similarity between two phrases $ i $ and $ j $ within $ S $. A value of $ {L}_{i,j}=1 $ indicates identical phrases, while $ {L}_{i,j}=0 $ denotes complete dissimilarity. The overall diversity of the phrases in $ S $ is determined from this matrix using the following equation:

$$ \mathrm{diversity}(S)={\left(\det L\right)}^{1/n} $$

In this context, $ n $ denotes the number of phrases in set $ S $. The approach assumes that the determinant of matrix $ L $ serves as an estimate of the extent of the high-dimensional space occupied by the vectors representing the phrases in $ S $. A set with higher diversity among its phrases is expected to occupy a larger portion of this space, leading to a higher determinant value.
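A minimal sketch of this diversity computation is given below. It assumes the similarity matrix L is built from pairwise cosine similarities of the term embeddings (reusing the embed helper above) and normalizes the determinant by the set size n, consistent with the description above.

```python
import numpy as np

def diversity_score(phrases):
    """Diversity of a set of phrases: determinant of the pairwise
    cosine-similarity matrix L, normalized by the set size n."""
    X = embed(phrases)                                 # reuse the helper above
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-length rows
    L = X @ X.T                                        # similarity matrix, L[i, i] = 1
    n = L.shape[0]
    sign, logdet = np.linalg.slogdet(L)                # log-determinant for numerical stability
    return float(np.exp(logdet / n)) if sign > 0 else 0.0
```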
4.2. Experiments and evaluation metrics
This section provides a brief introduction to the experiments conducted to test each hypothesis in this study, as well as the evaluation of the results for each experiment.
Hypothesis 1a concerns the alignment between design concepts generated by the LLMs with a specific persona and different professional knowledge bases. In contrast, Hypothesis 1b examines the difference between the semantic alignment of topics within the same knowledge base and that of topics across different knowledge bases. To test Hypothesis 1a, five sets of design concepts were generated for each design problem using the parallel prompting strategy with the five selected personas. Each set of concepts was then summarized into a set of related professional terms. The similarity between a set of terms and each professional knowledge base is measured using the group similarity metric introduced above. The testing of Hypothesis 1b includes two steps. For topics from the same knowledge base, semantic similarity is captured by the average intragroup pairwise cosine similarity within these topics. In contrast, semantic similarity between two different knowledge bases is represented using the group similarity metric.
Hypothesis 2 assesses whether prompting LLMs to update an existing set of design concepts with a professional persona can generate a more diverse set of design concepts. In this experiment, an LLM is first prompted with one persona to generate a set of design concepts, which are then revised under a different persona to produce a new set of concepts. Each set of concepts is then summarized into a set of professional terms, and the diversity of each set of terms is analyzed. Finally, the diversity scores of both sets of terms are compared to capture the change in design concept diversity through the update process.
Hypotheses 3a–3c are formulated to test the difference in the diversity of design concepts generated by LLMs when prompted using different strategies, and two strategies are compared explicitly in each hypothesis. In the experiment for each hypothesis, two sets of design concepts are generated using the specified strategies, and each set of concepts is summarized using related professional terms. The diversity scores of both sets of terms are then calculated and compared to demonstrate how the diversity of design concepts generated by the LLM differs when prompted with different strategies.
4.3. Sample outputs by LLM
4.3.1. Sample design concept generation and summarization
To illustrate the process of LLM-assisted design concept generation, we provide a sample design problem to ChatGPT (GPT-4o) using Prompt 1 with some professional personas to generate multiple sets of design concepts. Each set is then provided back to ChatGPT (GPT-4o) with Prompt 4 and summarized in professional terms. In this case, the design problem provided is “an innovative product to froth milk,” and Prompt 1 is modified accordingly as:
“I would like you to generate 5 different designs or solutions for an innovative product to froth milk using the knowledge base of a {profession}.
For each design concept, please provide a long paragraph of detailed technical description about how this design can be built and the specific professional concepts included in this design.”
The “{profession}” part in the prompt was replaced to specify the persona used in this concept generation process. Each design concept generated at this step was then summarized into 10 professional terms by modifying Prompt 4 in the following format:
“Listed below are 5 design concepts for an innovative product to froth milk:
{design_concepts}
I would like you to summarize each design concept into 10 professional terms strictly in the format of the following example:
Name: design concept.
Professional Terms: term 1; term 2; term 3; …”
The “{design_concepts}” part in the prompt was replaced by the design concepts generated by ChatGPT (GPT-4o) in the previous step. With these two prompts, five design concepts are generated for the selected design problem using a specified persona, and each design concept is summarized into 10 professional terms. For illustration, the output design concepts and summarized terms using the mechanical engineer and psychologist personas are provided in Tables 4 and 5, respectively.
Table 4. Design concepts and summarized terms generated for the selected problem using ChatGPT (GPT-4o) with the mechanical engineer persona

Table 5. Design concepts and summarized terms generated for the selected problem using ChatGPT (GPT-4o) with the psychologist persona

4.3.2. Sample design concepts generated by LLM using sequential prompting strategy
This section presents some sample design concepts generated by ChatGPT (GPT-4o) when prompted using the sequential prompting strategy. Specifically, the design problem selected for the generation process is also “an innovative product to froth milk,” and the tested sequence is: “computer scientist $ \to $ biologist $ \to $ psychologist $ \to $ mechanical engineer $ \to $ chemical engineer.” The sequential prompting process includes five steps, where the LLM is provided with one persona in each step. The LLM generates five design concepts in each step, and these concepts are summarized into 25 professional terms, resulting in a total of 125 summarized terms across the five steps. The generated design concepts for the selected problem using the selected sequence are provided in Table 6.
Table 6. Sample design concepts generated by ChatGPT (GPT-4o) using a selected sequence of personas for the design problem “an innovative product to froth milk”

According to Table 6, when prompted with the first persona in the sequence, “computer scientist,” ChatGPT (GPT-4o) generates design concepts that draw on the professional knowledge of a computer scientist, such as AI, IoT, robotics, augmented reality and voice control. In each subsequent step, ChatGPT (GPT-4o) updates each existing design concept using the professional knowledge of the persona prompted in that step. For example, updating the design concept “AI-Assisted Frother” using the persona “biologist” leads to “Bio-Inspired Frother,” and updating this concept using the “psychologist,” “mechanical engineer” and “chemical engineer” personas, in turn, yields the following design concepts: “Cognitive AI Frother,” “Adaptive Gear-Controlled Cognitive AI Frother” and “Nano-Emulsion AI Enhanced Frother.” Generally speaking, when prompted with the sequential prompting strategy, ChatGPT (GPT-4o) is capable of generating new design concepts and updating existing design concepts by leveraging the professional knowledge of each persona in the sequence.
4.4. Hypothesis testing results
4.4.1. Results for Hypothesis 1a
Hypothesis 1a compares the semantic alignment between LLM-generated concepts and professional knowledge bases. The results for all three LLMs are shown in Table 7, where the results for each LLM are computed across all seven design problems. In each matrix, each row corresponds to the persona provided to the LLM in the concept generation process, while each column corresponds to the professional knowledge base with which the generated concepts are compared. It is observed that for a specific professional knowledge base, the concepts generated by an LLM prompted with the persona corresponding to that knowledge base yield a higher average cosine similarity with the topics in this knowledge base, compared to concepts generated by the LLM with other professional personas. This suggests that when operating with a virtual professional persona, an LLM can generate design concepts that leverage more of the professional knowledge associated with this persona.
Table 7. Group similarity between LLM-generated concepts and professional knowledge bases (results for Hypothesis 1a)

Figures 3 and 4 show the sample results of this alignment between ChatGPT (GPT-4o)-generated design concepts for the design problem “an innovative product to froth milk” and the five professional knowledge bases. Specifically, each subfigure in Figure 3 shows the probability density of pairwise semantic alignments between professional terms summarized from ChatGPT-generated concepts with different professional personas and topics from a knowledge base, while each subfigure in Figure 4 shows the cumulative distribution of the alignments. In Figure 3, a distribution curve with a higher mean value represents a better alignment between the ChatGPT (GPT-4o) response and the knowledge base, whereas a curve with a lower mean represents a weaker alignment. Correspondingly, in Figure 4, a better alignment is represented by the cumulative distribution curve that reaches its maximum value later than the others, that is, at higher similarity values.

Figure 3. Probability density functions of ChatGPT (GPT-4o)-generated sample design concepts for “an innovative product to froth milk,” compared with different professional knowledge bases.

Figure 4. Cumulative density functions of ChatGPT (GPT-4o)-generated sample design concepts for “an innovative product to froth milk,” compared with different professional knowledge bases.
According to Figures 3 and 4, for topics within a specific professional knowledge base, the design concepts generated by ChatGPT (GPT-4o) with the corresponding professional persona have a consistently higher average similarity with these topics, as compared to concepts generated by ChatGPT (GPT-4o) with other personas. This shows that ChatGPT (GPT-4o), when prompted with a specific professional persona, is more capable of generating design concepts that leverage the topics from the corresponding knowledge base than when prompted with other personas.
4.4.2. Results for Hypothesis 1b
Hypothesis 1b was tested by calculating the average intragroup pairwise cosine similarity within each knowledge base and the group similarity between each pair of different knowledge bases. According to Table 8, values in diagonal cells are generally higher than values in off-diagonal cells. This reflects that topics within a professional knowledge base are semantically more aligned with one another than with topics from a different knowledge base. Thus, we can infer from this result that topics from different knowledge bases are semantically diverse, which supports Hypothesis 1b.
Table 8. Group similarity between different professional knowledge bases (results for Hypothesis 1b)

4.4.3. Results for Hypothesis 2
Hypothesis 2 evaluates how updating existing design concepts using an LLM with a professional persona would impact the diversity of design concepts. In this experiment, the LLM first generates five sets of design concepts for a design problem using the parallel prompting strategy with the five personas in this study, denoted as $ {S}_i $, where $ i $ refers to the persona used for concept generation. Next, each $ {S}_i $ is updated using Prompt 3 with each of the other four personas to generate four new sets of design concepts, denoted as $ {S}_{i,j}\left(i\ne j\right) $, where $ j $ refers to the persona used in the update process. Each of $ {S}_i $ and $ {S}_{i,j} $ is then summarized by the LLM using Prompt 4 into 50 related professional terms, and a diversity score is calculated to reflect the semantic diversity of the terms involved in each set, denoted as $ {D}_i $ and $ {D}_{i,j} $, respectively. Finally, each score $ {D}_{i,j} $ is compared to the corresponding $ {D}_i $ to reflect the change in the diversity of LLM-generated design concepts throughout the update process. This change is represented as a percentage change in each cell of Table 9, where a positive entry denotes an increase in diversity, while a negative entry denotes a decrease.
Table 9. Experiment results for Hypothesis 2

Table 9 presents the results for all three LLMs, where each cell shows the percentage change between each $ {D}_{i,j} $ and its corresponding $ {D}_i $. In this table, the cells with a positive value are highlighted in pink, indicating an increase in the diversity of design concepts. According to Table 9, updating existing design concepts using Gemini-1.5 flash with a persona would yield design concepts with increased diversity, while updating using ChatGPT (GPT-4o) or Llama-3.1-8B-Instruct would lead to no significant increase in the diversity of design concepts.
4.4.4. Results for Hypothesis 3a
Hypothesis 3a compares the diversity of design concepts generated by an LLM when it is (1) prompted with no professional persona and (2) prompted with multiple professional personas using parallel prompting. Specifically, the LLM is first prompted with no persona to generate 25 design concepts, and each concept is then summarized into five professional terms, providing a total of 125 terms. For comparison, the same LLM is also prompted to generate five design concepts for each of the five personas in this study using the parallel prompting strategy. Each concept is summarized into five professional terms, again providing a total of 25 design concepts and 125 terms. A comparison of the diversity of both sets of terms generated by each LLM is shown in Figure 5. It is observed that for each LLM, the design concepts generated when prompted in parallel with multiple personas yield a higher diversity score than when prompted with no persona.

Figure 5. Experiment results for Hypothesis 3a.
4.4.5. Results for Hypotheses 3b and 3c
Hypotheses 3b and 3c compare the diversity of design concepts generated by an LLM under the sequential prompting strategy against the collective and parallel prompting strategies, respectively. In this experiment, the LLM is prompted with the five personas to generate 25 design concepts using each strategy defined above, and each set of concepts is summarized into 125 professional terms. The diversity score of each set of terms is then computed and compared. Notably, since different sequences of personas can lead to different design concepts, five sequences are tested for the sequential prompting strategy. Both the results for all sequences and the sequence that leads to design concepts with the highest diversity score are recorded in each experiment. The sequences tested in this study are:
Sequence 1: ME $ \to $ CHE $ \to $ CS $ \to $ BIOL $ \to $ PSY
Sequence 2: CHE $ \to $ CS $ \to $ BIOL $ \to $ PSY $ \to $ ME
Sequence 3: CS $ \to $ BIOL $ \to $ PSY $ \to $ ME $ \to $ CHE
Sequence 4: BIOL $ \to $ PSY $ \to $ ME $ \to $ CHE $ \to $ CS
Sequence 5: PSY $ \to $ ME $ \to $ CHE $ \to $ CS $ \to $ BIOL
The results of all strategies are shown in Figure 6, where each subfigure represents the comparison using a different LLM. According to Figure 6, collective prompting generates design concepts with the lowest diversity among the three strategies. Moreover, for each LLM, parallel prompting overall provides design concepts with a level of diversity similar to that of concepts generated using the sequential prompting strategy with the “optimal” sequence of personas, where the “optimal” sequence is obtained by picking the sequence that leads to the most diverse design concepts out of all sequences tested.

Figure 6. Experiment results for Hypotheses 3b and 3c.
4.4.6. Further analysis of sequential prompting strategy
For the sequential prompting strategy, one key aspect to explore is the underlying mechanism by which it affects the diversity level of the design concepts generated. To investigate this, we conduct further analysis of the concepts generated during the sequential prompting process.
First, we aim to capture how the diversity of design concepts generated at each step changes with the introduction of a new persona. For the five sequences of personas and the same design problem defined in Section 4.4.5, we measure the diversity score of the design concepts generated at each step of the sequential prompting process. The results are shown in Figure 7. It is observed that for the selected design problem, the diversity score of the design concepts generated at each step of the sequential prompting process remains relatively stable, indicating that introducing a new persona at each step does not significantly impact the diversity level of the generated design concepts.

Figure 7. Diversity scores of design concepts generated by ChatGPT (GPT-4o) at each step of the sequential prompting process for the selected design problem.
We also investigate how the LLM leverages different professional knowledge bases throughout the sequential process, as reflected in the alignment between the design concepts generated at each step of the sequential prompting process and each professional knowledge base. We use the five sequences of personas and the same design problem as in the previous experiment to measure the similarity (using the method introduced in Section 4.1) between the design concepts generated at each step and each professional knowledge base. The results are shown in Figure 8. This figure presents a comparison of all five sequences tested in this study, where each entry in a table represents the similarity between the design concepts generated at a specific step of the sequence corresponding to that table and a professional knowledge base. Specifically, the boxed cells indicate the comparison between design concepts generated using a persona and the professional knowledge base corresponding to that persona. For example, Sequence 3 is CS $ \to $ BIOL $ \to $ PSY $ \to $ ME $ \to $ CHE. At Step 1 in Sequence 3, the boxed cell represents the comparison between the design concepts generated and the professional knowledge base of CS, the first persona in this sequence.

Figure 8. Group similarity between LLM-generated design concepts at each step of the sequential prompting process and professional knowledge bases.
As shown in Figure 8, at each step of the sequential prompting process, the LLM generates design concepts that are more similar to the professional knowledge base of the persona provided at the current step than to the professional knowledge bases of the other personas. For example, for Sequence 3, the design concepts generated at the first step are more aligned with the professional knowledge base of CS than those generated at any of the subsequent steps. At Step 2, the concepts align more with BIOL, at Step 3 with PSY, and so on. Furthermore, the results also show that at any of the later steps in the sequential prompting process, the design concepts generated are not significantly aligned with the professional knowledge bases of personas used earlier in the sequence. For example, the third persona used in Sequence 3 is psychologist (PSY), with computer scientist (CS) and biologist (BIOL) being the first two personas. However, according to the results for Step 3 in the third table, the design concepts generated are more aligned with the professional knowledge base of PSY, but not significantly more aligned with the professional knowledge bases of CS and BIOL. Together, these results indicate that, when prompted with multiple personas using the sequential prompting strategy, the LLM generates design concepts at each step using the professional knowledge base of the current persona alone, rather than integrating that knowledge base with the design concepts generated at preceding steps of the sequence.
The experiment results from Section 4.4.5 show that LLMs prompted with sequential and parallel prompting strategies generate design concepts with similar diversity scores. On this basis, we also explore how the diversity of design concepts differs when the LLM is prompted using the sequential prompting strategy with different sequences of professional personas. We again use the sequential prompting strategy with the same five sequences of personas as in the previous experiment to prompt ChatGPT (GPT-4o) to generate design concepts for design Problems 1 and 5. The results are shown in Figure 9 and Table 10. Figure 9 shows the distribution of diversity scores of design concepts generated by ChatGPT (GPT-4o) using each sequence of personas for the selected design problems, while the numerical results for mean and standard deviation of each distribution are shown in Table 10.

Figure 9. Cumulative density functions of diversity scores of design concepts generated by ChatGPT (GPT-4o) using different sequences of personas for selected design problems.
Table 10. Mean and standard deviation of diversity scores of design concepts generated by ChatGPT (GPT-4o) using different sequences of personas for selected design problems

It is observed that for the selected design problems, the design concepts generated by ChatGPT (GPT-4o) share a similar level of diversity when prompted with the five tested sequences of personas. Specifically, Sequence 1 yields the most diverse design concepts for Problem 1, Sequence 2 yields the most diverse design concepts for Problem 5, and Sequence 5 yields the least diverse design concepts for both design problems.
5. Conclusion and future work
This paper provides insights into how the diversity of design concepts generated by language models can be increased by prompting LLMs with virtual professional personas. Specifically, multiple hypotheses are formulated to compare different strategies for prompting LLMs with professional personas to generate design concepts. The methods used to test the hypotheses include knowledge base construction, selection of design problems and professional personas, and prompt design. Among the prompting strategies studied in this paper, the results reveal that LLMs generate more diverse design concepts when prompted with the parallel prompting strategy or the sequential prompting strategy (with the best sequence) than when prompted with the collective prompting strategy. We also conduct further analysis of the sequential prompting strategy, and the results reveal that (1) design concepts generated by an LLM share a similar level of diversity when it is prompted with different sequences of personas; and (2) when prompted with multiple personas using the sequential prompting strategy, the LLM generates design concepts at each step mostly using the professional knowledge base of the current persona alone.
Using the strategies studied in this paper, designers can address design concept generation problems by drawing on professional topics from the combined knowledge bases of multiple professional personas. By prompting LLMs using the two strategies identified above, the diversity of design concepts generated during the divergence stage of design can be increased, which helps designers converge on more innovative design concepts.
This article lays the groundwork for further exploration of the application of LLMs with professional personas in the design process. First, while this study evaluates five personas using multiple sample sequences within a sequential prompting strategy, future research could extend this approach to a broader range of professional personas. Additionally, we used phrases such as “mechanical engineer” to define a persona. In the future, richer descriptions could be developed by incorporating more detailed attributes – such as specific knowledge and expertise in a professional domain, and their associated cognitive styles – to create more nuanced personas. Finally, human studies could be conducted to investigate how these persona-based strategies influence the creativity and diversity of ideas generated by human designers during the ideation process.

