Impact Statement
Providing polymer engineers with tools that predict morphological features from easily knowable (or measurable) input parameters reduces the need to perform time-consuming and costly experiments to obtain desired morphological behaviors in such complex systems. This could reduce the overall complexity of the research and development (R&D) process for a variety of industries. Applying data-driven approaches to analyzing simulation data may also help to identify trends and features that are important but might not otherwise be detected by humans.
1. Introduction
Multicomponent polymer systems are of industrial interest in a variety of applications such as high-performance plastics (Nauman and He, 1994), membrane systems (Ulbricht, 2006; Yang et al., 2018), nanoparticle and nano-colloidal systems (Lee et al., 2017; Li et al., 2017), and drug delivery (Lao et al., 2008; Inguva et al., 2015). A key feature considered during the research and development (R&D) and/or manufacturing process is the morphology of the polymeric particles/blends formed. The morphology can profoundly impact the final product's performance and usability, as for many applications there is a desired optimal morphology. Understanding the relationship between polymer properties and the resultant blend morphology will therefore help guide product development from synthesis to manufacturing.
Computational methods provide an excellent tool for modeling various phenomena across scales in polymeric systems (Gooneie et al., 2017). Modeling techniques can be applied to understand and evaluate a variety of thermodynamic and transport properties, such as polymer-blend miscibility. They can also be used in engineering and manufacturing applications to understand how morphological structures form or how materials respond to processing conditions. On a molecular/atomic scale, techniques such as molecular dynamics have been widely used in polymer systems to evaluate properties such as the miscibility and interactions of polymers in a blend or with other species (Prathab et al., 2007; Luo and Jiang, 2010), diffusion coefficients and transport characteristics (Pavel and Shanks, 2005), composite elasticity (Han and Elliott, 2007), and nanoparticle morphology (Li et al., 2017). Recent nonequilibrium molecular dynamics studies have also considered the effect of shear on the morphology of anisotropic nanoparticles (Bianchi et al., 2015; Delacruz-Araujo et al., 2016), such as Janus nanoparticles, which are an interesting case within the possibility space of multicomponent polymer systems.
Continuum-based techniques are typically applied at length and timescales orders of magnitude larger than discrete approaches. Phase-field models, such as the Cahn–Hilliard equation (Cahn and Hilliard, 1958), are useful in capturing the dynamic behavior of structures and morphologies in heterogeneous systems. The Cahn–Hilliard model accounts for various thermodynamic driving forces for morphology evolution, such as homogeneous free energy and interfacial energy, and can also be adapted to account for other relevant transport phenomena such as convection (Wodo and Ganapathysubramanian, 2012). Previous continuum-scale simulations of multicomponent polymer systems have focused on "uphill" diffusion, as described by the Cahn–Hilliard equation. Examples of previous applications include systems of two polymers and one solvent (denoted polymer–polymer–solvent [PPS]) (Alfarraj and Nauman, 2007; Shang et al., 2011), or ternary polymer systems (Nauman and He, 1994; Alfarraj and Nauman, 2007). Studies that have considered the influence of convective transport in multicomponent systems have only evaluated a single polymer species precipitating out of solution (Zhou and Powell, 2006; Tree et al., 2017). Consequently, there is a knowledge gap in the understanding of how systems containing more than one polymer, that is, PPS and polymer–polymer–polymer (PPP) systems, behave in the presence of convective mass transport.
Machine learning (ML) has increasingly found use in physical simulations and computational modeling by complementing or replacing traditional modeling approaches. ML is noted for its strength in pattern recognition and data mining (Brunton et al., 2020), which enables its use for many tasks relevant to physical simulations. Previous studies have applied ML to develop data-driven surrogate/reduced-order models (ROMs) to improve the speed of computations and results generation (Peherstorfer et al., 2017; Janet et al., 2018; San and Maulik, 2018). ML has also been used to improve the accuracy of physical simulations by enabling the development of data-driven closures (Duraisamy et al., 2019) and models (Chmiela et al., 2017), which can capture more of the data than traditional first-principles or empirical models. Within materials science, ML has similarly found increasing use in a variety of cases, ranging from the prediction of macroscopic self-assembled structures using molecular properties (Inokuchi et al., 2018) to the prediction of novel permanent magnets (Möller et al., 2018) and the optimization of alloy properties (Ward et al., 2018).
In polymer science specifically, ML has been used to optimize polymer-gel screening for injection wells (Aldhaheri et al., 2017) and polymer solar cells (Jørgensen et al., 2018), to improve polymeric interfacial compatibilization (Meenakshisundaram et al., 2017), and to classify and predict the physical features of polymer systems. For instance, supervised feed-forward neural networks have been used to recognize configurations produced from Monte Carlo simulations of polymer models, distinguishing between differently ordered states (Wei et al., 2017). Self-folding mechanisms of polymer composite systems have also been modeled (Guo et al., 2013). There is, however, a knowledge gap in the use of ML as a tool for classifying and predicting polymer-blend morphology. ML techniques are well suited to this end as they are able to learn complex features from the data, which facilitates pattern recognition and dimensionality reduction (Cai et al., 2018).
Previous work in the broader materials science field has typically considered only one aspect of the pipeline, such as employing dimensionality reduction to identify important features that contribute to the outer structure of nanoparticles (López-Donaire et al., 2012) or applying supervised ML for classification of new carbon black samples (Fernandez Martinez et al., 2017). The present study hence applies an ML workflow, comprising dimensionality reduction techniques combined with a clustering algorithm, to separate morphological data into clusters of distinct morphologies. Pattern prediction and design-space mapping are investigated via classification, using Gaussian process classification (GPC) techniques, in the low-dimensional transformed feature space. The present analysis is restricted to PPP systems and does not consider the effects of convective mass transfer. The value of the workflow developed in this study is twofold: (a) the approach is generalizable to additional engineering fields beyond polymer science; and (b) polymer blends can be studied more expediently, as regions of interest (input parameters/morphology) can be identified first, allowing computational and experimental resources to be focused.
2. Theory and Methodology
We address the problem set where physics-based simulation outputs need to be predicted from known input parameters: a workflow for integrating physics-based simulations and ML clustering is outlined in Figure 1.
2.1. Physics-based Cahn–Hilliard system
The Cahn–Hilliard equation is well suited for modeling polymer-blend precipitation at continuum length and timescales. Three different polymer species $ a,b,c $ are tracked; however, the following derivation is kept as general as possible to demonstrate the applicability of the model to $ N $-component mixtures. We use a modified Cahn–Hilliard system based on the work of Petrishcheva and Abart (2012), which has the advantage of being able to handle mixtures whose components differ in diffusivity by orders of magnitude. This ability is important in modeling systems containing polymers and solvents, or polymers of significantly different chain lengths (Alfarraj and Nauman, 2007).
The model used in the present work is also easier to implement than the earlier method of Alfarraj and Nauman (2007) for handling components with large differences in diffusivity. Alfarraj and Nauman (2007) introduced a "proportional flux method" within a finite difference scheme to ensure that the sum of fluxes into a point is zero. The proportional flux method introduces two issues: (a) the method itself is a heuristic solution without a robust theoretical foundation (Nauman and Savoca, 2001) and (b) many modern partial differential equation (PDE) solvers, e.g., FEniCS (Logg et al., 2012) and FiPy (Guyer et al., 2009), are primarily declarative, and ad hoc adjustments to standard discretizations are difficult to implement.
The Cahn–Hilliard equation, as previously mentioned, models uphill diffusion where the driving force is gradients in the chemical potential $ \mu $ rather than concentration. Correspondingly, the flux $ {\boldsymbol{j}}_i $ of species $ i $ can be represented as

$$ {\boldsymbol{j}}_i=-\sum_{j=1}^{N}{L}_{ij}\nabla {\mu}_j, \qquad (1) $$
where each $ {L}_{ij} $ is a species mobility coefficient and $ \boldsymbol{L} $ is the square symmetric mobility matrix. The following constraints are imposed on the flux expression due to the Onsager reciprocal relations and the requirement that the total flux into a point is zero (Petrishcheva and Abart, 2012):

$$ {L}_{ij}={L}_{ji},\qquad \sum_{i=1}^{N}{L}_{ij}=0. \qquad (2) $$
Correspondingly, as per Petrishcheva and Abart (2012), $ {\boldsymbol{j}}_i $ can be expressed in terms of differences in chemical potentials,

$$ {\boldsymbol{j}}_i=-\sum_{j\ne i}{L}_{ij}\nabla \left({\mu}_j-{\mu}_i\right). \qquad (3) $$
As the Gibbs energy functional is scaled by $ RT $, $ {L}_{ij} $ can be expressed with the following relationship:

$$ {L}_{ij}=-{D}_{ij}{x}_i{x}_j\quad \left(i\ne j\right), \qquad (4) $$
where $ {x}_i $ and $ {x}_j $ are the mole fractions of species $ i $ and $ j $, respectively. The mobility coefficients $ {L}_{ij} $ are composition dependent, but the effective diffusion coefficients $ {D}_{ij} $ can be taken as constant.
To obtain the relevant transport equation for each species, we apply the continuity equation,

$$ \frac{\partial {x}_i}{\partial t}=-\nabla \cdot {\boldsymbol{j}}_i, \qquad (5) $$
where $ t $ is the time. To determine an expression for the chemical potential, a generalized Landau–Ginzburg free energy functional for $ N $ components, $ {G}_{\mathrm{system}} $, which accounts for inhomogeneity in the system, is first considered (Cahn and Hilliard, 1958; Nauman and Balsara, 1989),

$$ {G}_{\mathrm{system}}={\int}_V\left[g+\frac{1}{2}\sum_{i=1}^{N}{\kappa}_i{\left(\nabla {x}_i\right)}^2+\sum_{i=1}^{N}\sum_{j>i}{\kappa}_{ij}\nabla {x}_i\cdot \nabla {x}_j\right]\mathrm{d}V, \qquad (6) $$
where $ g $ is the homogeneous free energy contribution, and $ {\kappa}_i $ and $ {\kappa}_{ij} $ are the self- and cross-gradient energy parameters, respectively. To evaluate whether the simulation has reached an equilibrium state, the Gibbs free energy was determined at each time step as per Equation (6). For polymeric systems, the homogeneous free energy is well represented by the Flory–Huggins equation,

$$ g=\sum_{i=1}^{N}\frac{x_i\ln {x}_i}{n_i}+\sum_{i=1}^{N}\sum_{j>i}{\chi}_{ij}{x}_i{x}_j, \qquad (7) $$
where $ {n}_i $ is the polymer chain length and $ {\chi}_{ij} $ is the Flory–Huggins binary interaction parameter. The generalized chemical potential, applicable for inhomogeneous systems, for each species $ i $ can be expressed as the variational derivative of the Gibbs energy functional (Nauman and Balsara, 1989; Cogswell, 2010),

$$ {\mu}_i=\frac{\delta {G}_{\mathrm{system}}}{\delta {x}_i}=\frac{\partial g}{\partial {x}_i}-{\kappa}_i{\nabla}^2{x}_i-\sum_{j\ne i}{\kappa}_{ij}{\nabla}^2{x}_j. \qquad (8) $$
We replace $ {x}_i $ with $ a,b,c $ to represent the mole fractions of the three species of interest: A, B, and C, respectively, and write the corresponding differences in chemical potentials required by Equation (3) as Equations (9)–(11).
The gradient energy parameters for the PPP system (Nauman and He, 1994), Equations (12)–(14), are evaluated for the specific case of all polymer species having the same radius of gyration $ {R}_{\mathrm{G}} $ and diffusivity.
The compositional dependence of $ {\kappa}_i $ and $ {\kappa}_{ij} $ is neglected, following an approach commonly used in similar simulation studies (Nauman and He, 1994; Zhou and Powell, 2006; Alfarraj and Nauman, 2007). This also simplifies the computations.
Combining the expressions for the chemical potential, Equations (9)–(11), the species flux, Equations (3) and (4), and the continuity equation, Equation (5), we arrive at the transport equations tracking species A and B, Equations (15) and (16).
Species C is obtained by using a material balance constraint,

$$ c=1-a-b. \qquad (17) $$
For symmetric PPP systems, where all species diffusivities can be assumed equal, the equation system given by Equations (15)–(17) reduces to that considered by Nauman and He (1994).
2.1.1 Scaling
We introduce scalings for length and time, Equations (18) and (19), where $ {d}_{\mathrm{p}} $ is the characteristic length scale. The chemical potential and Gibbs energy functional are scaled by $ RT $ (denoted from here on as $ {\tilde{\mu}}_i $ and $ \tilde{g} $, respectively). We consider the case of a symmetric PPP system, that is, all the polymer species have the same chain length and diffusivity, resulting in the dimensionless equation system of Equations (20)–(24), which models the polymer-blend demixing dynamics and forms the final set of equations to be solved.
2.1.2 Numerical implementation
Numerical solution of the Cahn–Hilliard equation is challenging due to the fourth-order derivative; hence, the equation is typically treated as a set of coupled second-order PDEs (Jokisaari et al., 2017), as in the equation system of Equations (20)–(24). The system was reformulated in variational form and solved with an open-source finite-element solver, FEniCS (Logg et al., 2012).
An unstructured square mesh of domain length 40 dimensionless units was generated with a resolution of 80 cells along each domain boundary. Periodic boundary conditions were applied on the left and right boundaries, and Neumann conditions were applied on the top and bottom boundaries. Unknown variables were treated implicitly, and a backward Euler method was applied for time discretization. PPP cases were simulated for a duration of $ \tilde{t}=400 $ with a time step of $ \Delta \tilde{t}=0.02 $, corresponding to a physical duration of $ t=16\;\mathrm{s} $. Physical parameters were set to $ {R}_{\mathrm{G}}=200\times {10}^{-10}\;\mathrm{m} $, $ {d}_{\mathrm{p}}={R}_{\mathrm{G}} $, and $ {D}_{\mathrm{AB}}={10}^{-11}\;{\mathrm{m}}^2{\mathrm{s}}^{-1} $. Simulated values of $ {\chi}_{ij} $ ranged from 0.003 to 0.009, with runs orchestrated via automated batch scripting (Di Tommaso et al., 2017). In total, 1,140 simulation runs were performed, spanning five independent input parameters. The entire simulation process took ~80,000 core-hours. Benchmarking and model validation are discussed in the supplementary material.
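To illustrate the split formulation and implicit time stepping, the sketch below assembles a binary Cahn–Hilliard system (one composition field and its chemical potential) in the legacy FEniCS/DOLFIN Python API. It is a minimal sketch under simplifying assumptions: a structured mesh, two components instead of three, default Neumann boundaries, and illustrative parameter values; it is not the exact ternary system of Equations (20)–(24) solved in this study.

```python
# Minimal sketch (legacy FEniCS/DOLFIN API): a binary Cahn-Hilliard system
# split into two coupled second-order PDEs. All values are illustrative.
from fenics import *
import numpy as np

mesh = RectangleMesh(Point(0, 0), Point(40, 40), 80, 80)  # structured, for brevity
P1 = FiniteElement("Lagrange", mesh.ufl_cell(), 1)
ME = FunctionSpace(mesh, MixedElement([P1, P1]))  # unknowns: (a, mu)

u, u0 = Function(ME), Function(ME)   # current and previous time-step solutions
a, mu = split(u)
a0, _ = split(u0)
v, w = TestFunctions(ME)

chi, n = Constant(0.006), Constant(1000.0)   # interaction parameter, chain length
kappa, dt = Constant(1.0), Constant(0.02)    # gradient energy parameter, time step

# Flory-Huggins homogeneous free-energy derivative dg/da for a binary blend
dgda = (ln(a) + 1) / n - (ln(1 - a) + 1) / n + chi * (1 - 2 * a)

# Backward Euler in time; degenerate mobility a(1 - a); mu treated implicitly
F = ((a - a0) / dt) * v * dx \
    + a * (1 - a) * dot(grad(mu), grad(v)) * dx \
    + mu * w * dx - dgda * w * dx - kappa * dot(grad(a), grad(w)) * dx

# Random perturbation about the mean composition as the initial state
V0 = ME.sub(0).collapse()
a_init = Function(V0)
vals = 0.5 + 0.02 * (np.random.rand(a_init.vector().get_local().size) - 0.5)
a_init.vector().set_local(vals)
a_init.vector().apply("insert")
assign(u.sub(0), a_init)
u0.assign(u)

for step in range(10):   # a few illustrative time steps
    solve(F == 0, u)     # Newton solve of the nonlinear variational problem
    u0.assign(u)
```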
2.2. Machine learning
2.2.1 Methodology and workflow
The ML workflow consisted of three steps: (a) dimensionality reduction, (b) clustering, and (c) supervised learning. Dimensionality reduction is required to address the curse of dimensionality (Kriegel et al., 2009), which can make clustering of high-dimensional data, such as raw simulation images, prohibitively expensive. Clustering on the low-dimensional processed data was used to identify morphologies. Supervised ML, using GPC, was then used to determine a relationship between the physical input parameters and the resultant morphology.
2.2.2 Dimensionality reduction
Each simulation generates a high-resolution color image of the physical morphology, where the RGB image channels form a proxy for the species molar fractions (i.e., subject to the material balance of Equation (17)). Prior to dimensionality reduction, each image was preprocessed to $ 200\times 200 $ pixel resolution. Dimensionality reduction was applied using three candidate techniques: (a) principal component analysis (PCA), (b) t-distributed stochastic neighbor embedding (t-SNE), and (c) autoencoder compression (Wang et al., 2016). These are representative of the three broad categories of dimensionality reduction techniques available: (a) matrix decomposition or linear techniques, (b) manifold learning or nonlinear techniques, and (c) artificial neural network (ANN)-based techniques. PCA and t-SNE were implemented using scikit-learn (Pedregosa et al., 2011), while the autoencoder was implemented using Keras (Chollet et al., 2018).
For PCA and t-SNE pipelines, the set of species concentration fields was extracted as three grayscale images ( $ 200\times 200 $ ); image arrays were then flattened and concatenated into a one-dimensional array of length 120,000 ( $ 3\times 200\times 200 $ ) representing each simulation result. For the autoencoder, the preprocessed color images were used as direct inputs.
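As an illustration of this preprocessing step, the sketch below builds the flattened feature matrix from the image set; the file-naming scheme is hypothetical.

```python
# Illustrative preprocessing sketch: each simulation image is split into
# per-species grayscale fields, flattened, and concatenated into one
# 120,000-element feature vector (3 x 200 x 200).
import numpy as np
from PIL import Image

def image_to_feature_vector(path):
    img = Image.open(path).resize((200, 200))
    rgb = np.asarray(img, dtype=np.float32)[:, :, :3] / 255.0  # H x W x 3
    # treat each RGB channel as a proxy species-concentration field
    return np.concatenate([rgb[:, :, c].ravel() for c in range(3)])

# hypothetical file names for the 629 retained simulation images
X = np.stack([image_to_feature_vector(f"run_{i:04d}.png") for i in range(629)])
print(X.shape)  # (629, 120000)
```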
Principal component analysis (PCA). For a set of input arrays, the $ q $ principal components (PCs) form the orthonormal axes onto which the retained variance under projection is maximal. The PCA pipeline within scikit-learn was undertaken using the probabilistic model of Tipping and Bishop (1999); the number of retained PCs, $ q $, was varied between 0 and 100, corresponding to the dimensionality of the embedding.
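A minimal sketch of this stage with scikit-learn, assuming the feature matrix `X` from the preprocessing sketch above and an illustrative choice of $ q $:

```python
# Sketch of the PCA stage; q is one illustrative value within the scanned range.
from sklearn.decomposition import PCA

q = 50
pca = PCA(n_components=q, svd_solver="randomized", random_state=0)
X_pca = pca.fit_transform(X)                   # (n_samples, q) embedding
print(pca.explained_variance_ratio_.sum())     # variance retained by q PCs
```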
t-distributed stochastic neighbor embedding (t-SNE). Nonlinear dimensionality reduction was undertaken in scikit-learn using the t-SNE technique (Maaten and Hinton, 2008; Wattenberg et al., 2016). The number of embedding dimensions was varied from 2 to 10, while the perplexity was varied from 5 to 50.
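The corresponding t-SNE stage can be sketched as follows; the dimensionality and perplexity shown are single illustrative points within the ranges scanned.

```python
# Sketch of the t-SNE stage. Note: scikit-learn's default Barnes-Hut method
# supports at most 3 embedding dimensions; higher-dimensional embeddings
# require method="exact".
from sklearn.manifold import TSNE

tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
X_tsne = tsne.fit_transform(X)   # (n_samples, 2) embedding
```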
Autoencoder compression. An autoencoder (Lecun et al., 1998) is a specific type of ANN that consists of two sections: (a) an encoder that compresses high-dimensional data to a low-dimensional "bottleneck" representation and (b) a decoder that recovers the original data from the encoded data (Hinton, 2006). Autoencoders are highly suitable for dimensionality reduction, with performance often exceeding alternative techniques such as PCA (Hinton, 2006; Wang et al., 2016). Autoencoders are ideal for image analysis applications (Chen et al., 2017); moreover, the hidden-layer nodes at the ANN bottleneck can be exploited for downstream clustering pipelines.
Autoencoder architectures can be labeled using the convention $ T $–$ N $, where $ T $ denotes the layer type and $ N $ is the number of layers. Layer types explored presently include (a) densely connected (denoted "Dense") and (b) convolutional (denoted "Conv"). The simplest autoencoder architectures consist of an input layer and an output layer with one or more Dense layers in a stacked fashion: Dense–1 has a single dense encoder layer which also serves as the bottleneck, while Dense–2 and Dense–3 contain additional layers. More complex architectures, such as LeNet–5 (Lecun et al., 1998), apply a combination of Dense and Conv layers.
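A minimal sketch of a Dense–1 architecture in Keras is given below; the bottleneck width and training settings are illustrative assumptions, not the tuned values reported in the supplementary material.

```python
# Sketch of a Dense-1 autoencoder: one dense encoder layer doubling as the
# bottleneck, plus a dense decoder. Sizes and hyperparameters are illustrative.
from tensorflow.keras import layers, models

input_dim, bottleneck = 120000, 128
inp = layers.Input(shape=(input_dim,))
code = layers.Dense(bottleneck, activation="relu")(inp)    # encoder + bottleneck
out = layers.Dense(input_dim, activation="sigmoid")(code)  # decoder
autoencoder = models.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy",
                    metrics=["accuracy"])
# autoencoder.fit(X, X, epochs=10, batch_size=32)

# After training, the encoder half yields the embedding used for clustering
encoder = models.Model(inp, code)
# Z = encoder.predict(X)
```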
All candidate autoencoders were trained on the full preprocessed image data set, enabling autoencoder compression to be implemented into the same workflow (Figure 1) as PCA and t-SNE dimensionality reduction. Due to the small data set size, the conventional split into training and validation sets (Petscharnig et al., 2017; Chen and Huang, 2019) for evaluating clustering accuracy was not undertaken. A summary of all explored autoencoder types and hyperparameters is given in the supplementary material. Hyperparameter optimization for the dense autoencoders was performed using Talos (Autonomio, 2019) (fractional random search over 10–15% of the grid; 10 epochs), while convolutional autoencoders were tuned manually.
2.2.3 Clustering
To implement a purely unsupervised learning pipeline, the clustering method, together with an appropriate metric, heuristic, or algorithm, should be able to estimate the optimal number of clusters. Scikit-learn implements a variety of clustering algorithms that can be used to this end, such as affinity propagation, spectral clustering, hierarchical clustering, and $ k $-means.
The popular $ k $-means algorithm, as implemented in scikit-learn, was first used to carry out clustering as a "sign of life" check, that is, to establish whether a naive application of a clustering algorithm yields initially positive results that justify further exploration. To this end, $ k $-means was found to be as effective as any of the clustering algorithms outlined above and was used as the default clustering algorithm for all the embeddings generated by the various dimensionality reduction techniques.
The $ k $ -means algorithm works by clustering the datapoints into $ k $ groups by minimizing the within-cluster sum of squares (WCSS) (Yuan and Yang, Reference Yuan and Yang2019). The algorithm requires an initialization method (set to “k-means++”) and a predetermined number of clusters, $ k $ . Evaluating an appropriate number for $ k $ is one of the main challenges when using $ k $ -means and the elbow method was used. The elbow method involves running $ k $ -means for a range of $ k $ values and calculating the distortion or the total sum of square errors (Yuan and Yang, Reference Yuan and Yang2019). Ideally, when the number of clusters nears the real number of clusters, there is an inflexion in the distortion, showing as an “elbow” in a plot of distortions vs. $ k $ .
For the clustering of the t-SNE embedding of the Conv–3/4/5 autoencoder bottleneck values, $ k $-means was ill-suited as it is unable to effectively identify clusters with non-globular shapes. In this case, affinity propagation was used (Frey and Dueck, 2007). Affinity propagation does not require the number of clusters to be declared; rather, it clusters by passing messages between datapoints, representing the fitness of one sample to serve as the exemplar of another, until a set of exemplars emerges, which determines the final number of clusters (Frey and Dueck, 2007). The scikit-learn implementation of affinity propagation was used; it has two main hyperparameters: the preference, which controls how strongly a datapoint is favored as an exemplar, and the damping ratio, which was set to 0.9 for stability.
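A corresponding sketch using the scikit-learn implementation, with the preference and damping values quoted in this section, applied to a two-dimensional embedding `Z_tsne`:

```python
# Sketch of the affinity-propagation fallback for non-globular clusters.
from sklearn.cluster import AffinityPropagation

ap = AffinityPropagation(preference=-250, damping=0.9, random_state=0)
labels = ap.fit_predict(Z_tsne)                  # Z_tsne: 2D t-SNE embedding
n_clusters = len(ap.cluster_centers_indices_)    # number of exemplars found
```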
2.2.4 Supervised learning
A scikit-learn GPC model was trained on the initial concentrations ( $ {a}_0 $ , $ {b}_0 $ ) and the corresponding cluster labels. A test set (20% of the data) was used to check whether the classifier returned the correct cluster, given the specific parameters. Radial basis functions (RBFs) with a length scale of $ 1.0 $ were selected for the Gaussian process kernel. The data augmentation process is described in the results and discussion section. To perform the prediction, the initial species concentrations ( $ {a}_0 $ , $ {b}_0 $ ) were varied within a specific range ( $ {a}_0\in \left[0.1,0.8\right] $ , $ {b}_0\in \left[0.1,0.45\right] $ ) and passed into the trained GPC to evaluate the morphology maps as shown in Figure 10.
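A sketch of this supervised stage, assuming an array `X_comp` of initial compositions $ \left({a}_0,{b}_0\right) $ and an array `y` of cluster labels (both names are placeholders):

```python
# Sketch of the GPC stage: an RBF kernel with length scale 1.0 maps initial
# compositions to morphology cluster labels, then evaluates a composition grid.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_comp, y, test_size=0.2, random_state=0)   # 20% held-out test set
gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0))
gpc.fit(X_train, y_train)
print(gpc.score(X_test, y_test))                # test-set classification accuracy

# Morphology map over the composition ranges given above
a0, b0 = np.meshgrid(np.linspace(0.1, 0.8, 100), np.linspace(0.1, 0.45, 100))
grid = np.column_stack([a0.ravel(), b0.ravel()])
morphology_map = gpc.predict(grid).reshape(a0.shape)
```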
3. Results and Discussion
3.1. Polymer demixing simulation results
The numerical stability of Cahn–Hilliard solution methods is a known issue, especially when using the Flory–Huggins free-energy function at higher $ {\chi}_{ij} $ values (Brunswick et al., 1998). Three possible simulation states were identified, as shown in Table 1, demonstrating the impact of numerical issues on the solution of the physical model: either the simulation would diverge prematurely due to numerical instability (State 3a), or no demixing would occur even though the input parameters were selected such that the physical system lies within the chemical spinodal (State 2). The present data set of 629 images was restricted to samples from States 1 to 3b. A representation of how the Gibbs energy evolved with time for each of the three cases is shown in Figure 2.
3.2. Dimensionality reduction and clustering
The effectiveness of each dimensionality reduction and clustering technique assessed is summarized in Figure 3.
3.2.1 Principal component analysis (PCA)
The image set could not be partitioned into distinct embedded-space clusters via PCA. As shown in Figure 4, the number of clusters evaluated by $ k $-means consistently remained between 5 and 6, independent of the number of retained PCs, demonstrating that the application of PCA here is ineffective: distinct clusters, each with a characteristic morphology, did not emerge, indicating that PCA was not capable of learning the distinguishing features of the system.
Figure 5 shows how the clustering took place in two-dimensional (2D) space: each cluster contained a variety of morphologies and was therefore not distinct. Each cluster was, however, consistent in terms of the predominant continuous phase. This is evidenced in the sample images for each cluster. For example, cluster 0 contains globular dispersed-phase patterns; however, a mixture of core–shell (one component encapsulates the other), single-component (one component is present), and miscible (both components are mixed) patterns are present in the globular structure. A detailed description of the various observed morphologies and exemplar images is available in the supplementary material. Similar clustering behavior was observed when retaining higher numbers of PCs.
3.2.2 t-distributed stochastic neighbor embedding (t-SNE)
For t-SNE dimensionality reduction techniques, the variance in the optimal number of clusters was observed to increase with the number of embedding dimensions as shown in Figure 6. For almost all values of perplexity, the optimal number of clusters was 5 for two embedding dimensions and 7–8 for three embedding dimensions. A maximum in the number of clusters was observed for the combination of 7 embedding dimensions and perplexity 10.
Clustering in three or more embedding dimensions resulted in outlier clusters which contained only 1–2 datapoints (not shown). Clusters of outlying datapoints in the embedded space distorted the $ k $-means process. The optimal number of clusters is also more sensitive to the perplexity values, which is reflected in the increasing variance in the number of optimal clusters. A visual inspection of the images from each cluster for sample cases revealed poorer clustering performance than PCA or t-SNE in two embedding dimensions, with clusters showing more variation in the types of morphologies and in the continuous-phase species present.
Use of two embedding dimensions resulted in more consistent clustering performance, with the optimal number of clusters mostly remaining at 5: a representative example can be seen in Figure 7. The dimensionality reduction and clustering performance was comparable to PCA; while there was significant mis-clustering within each cluster, each cluster was generally consistent with regard to the continuous-phase species present. Ultimately, both PCA and t-SNE were unable to capture an adequate number of features to describe the diverse morphologies arising from ternary polymer blends, prompting the exploration of autoencoders as an alternative unsupervised ML workflow.
3.2.3 Autoencoders
The performance of each autoencoder architecture tested is shown in Table 2, together with reconstructed images and representative loss and accuracy values. Increasing the size and depth of the Dense autoencoders, which increases the number of tuneable parameters, did not improve the loss and accuracy values, which plateau at ~0.01 and ~0.6, respectively, for the optimal set of hyperparameters. The reconstructed image quality remains consistently poor. Autoencoders with Dense layers only generated poor embeddings of the images and were not explored further.
Conv autoencoder architectures performed significantly better than Dense architectures, as evidenced by the reconstructed images in Table 2. The accuracy and reconstructed image quality increased with the number of bottleneck embedding dimensions. A dimensionality of $ \gtrsim 500 $ is necessary to reconstruct both the morphology and the continuous-phase species identity. A set of reconstructed images and loss/accuracy results for Conv–3/4/5 autoencoders is presented in the supplementary material for a range of bottleneck filter values. The addition of a Dense layer at the bottleneck of the Conv–Dense autoencoders greatly increased the number of required parameters ( $ \ge {10}^9 $ ); hence, only the Conv–4 and Conv–5 architectures with Dense–2 and Dense–1 bottleneck layers were tested, due to their tractable memory requirements. Conv–Dense performance was comparable to the Dense–1/2/3 architectures and was not explored further.
Clustering via $ k $ -means was performed directly on the Conv autoencoder bottleneck as shown in Figure 8. Performance was found to be comparable with the other dimensionality reduction techniques tested (PCA and t-SNE): clustering on the full embedding yielded a similar number of clusters as the previous approaches.
As the dimensionality of the bottleneck was comparatively high (~100 to ~1,000), additional dimensionality reduction techniques were stacked to further reduce the data dimensionality (Maaten and Hinton, 2008). Applying PCA and retaining two PCs was not effective, yielding clustering performance comparable to applying PCA or t-SNE directly. Further dimensionality reduction was instead applied to the bottleneck values using t-SNE with a final two-dimensional embedding. The resulting datapoint distribution, shown in Figure 9a, almost exactly tracks the compositions $ {a}_0 $ and $ {b}_0 $: following the "S"-shaped curve from the bottom, the value of $ {a}_0 $ increases. This result was consistent across the various Conv autoencoders, and the results from the combination of the Conv–4 (4 filters) bottleneck and t-SNE are presented in Figure 9.
Clustering via $ k $ -means techniques is ill-suited due to the shape of the embedded data shown in Figure 9b,c: the clusters cannot be ellipsoidal/spherical. Two alternative options were tested as shown in Figure 3: (a) manual clustering following the composition trend and (b) affinity propagation.
Affinity propagation (with optimal preference $ -250 $) resulted in 24 clusters of comparable size (Figure 9b), while manual clustering following the composition trend yielded 21 clusters of varying sizes (Figure 9c). For affinity propagation and manual clustering following the composition trend, ~30% and ~17.6%, respectively, of the datapoints within each cluster did not correspond to the majority morphology of that cluster. Furthermore, both clustering techniques resulted in non-unique clusters, as different clusters would have the same majority morphology: this reduces the inherent value of each cluster as a bin to capture a distinct morphological class, adversely impacting the usability of the cluster labels for the subsequent supervised ML tasks. The majority pattern of each cluster, for both manual clustering following the composition trend and affinity propagation clustering from the Conv–4 (4 filters)–t-SNE dimensionality reduction, is presented in the supplementary material, together with a separate direct manual clustering of the high-resolution images based on the morphology.
We speculate that it may be possible to obtain further improvements in the clustering performance for this case by adopting classical image analysis and feature-engineering approaches. However, such approaches are beyond the scope of the present work as the premise of applying unsupervised ML is defeated when feature engineering is performed. Classical feature engineering and extraction identify defined features such as edges or shapes; hence, classical image analysis is conceptually similar to manual clustering. In addition, while the dimensionality reduction and clustering sections of the proposed workflow had limited success in the present study, the workflow is generalizable and can be implemented with minimal modification. However, performing feature engineering constrains the workflow as it requires the features to be re-engineered for each new problem.
3.3. Morphology prediction
The manual cluster identities obtained by performing manual clustering directly on the high-resolution images were used to train a prediction model. Even though affinity propagation clustering and manual clustering following the composition trend on the Conv–t-SNE embedding, as outlined in Figure 3, were able to identify clusters with reasonable accuracy (Figure 9b), each cluster did not represent an intrinsic morphology. Therefore, the manual labels were deemed more appropriate for downstream supervised learning.
The simulation has a total of eight parameters: the composition, which is controlled by $ {a}_0 $ and $ {b}_0 $; the polymer chain lengths $ {N}_i $ for $ i=\mathrm{1,2,3} $; and the binary interaction parameters $ {\chi}_{ij} $ for $ \left(i,j\right)=\left(1,2\right) $, $ \left(1,3\right) $, and $ \left(2,3\right) $. Polymers of the same chain length, $ {N}_i=1,000 $ for all $ i $, were considered in this study, resulting in a total of five independent variables. For ease of visualization, 2D slices of the five-dimensional (5D) space are presented when we consider how the composition affects the morphology for different cases of interaction parameters. The entire image data set, together with the corresponding simulation parameters, is available in the supplementary material.
The quantity of raw data for a given slice was approximately 25–50 datapoints. The low count was deemed inadequate for stable predictions (whereby the prediction quality becomes independent of the data quantity), and data augmentation was hence applied. Performing additional simulation runs was not considered due to computational limitations and anticipated numerical instability issues. Data augmentation was performed under the assumption that slight perturbations in $ {a}_0 $ and $ {b}_0 $ do not change the morphology of the polymer blend. The size of the data set was increased threefold by considering $ \left({a}_0\pm \varepsilon, {b}_0\pm \varepsilon \right) $ for $ \varepsilon \in \left\{0.002,0.005\right\} $ . The associated uncertainty introduced to the prediction was bounded by 5%. This comes about when considering the smallest datapoint $ {a}_0=0.1 $ , $ {b}_0=0.1 $ and the largest change of $ \pm 0.005 $ . The $ {\mathrm{\ell}}_2 $ norm changes by 5% in this case and is lower for all other datapoints.
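One plausible reading of this augmentation is sketched below under the stated label-invariance assumption; the exact pattern of signed perturbations used to reach the threefold increase is an assumption here, not taken from the text.

```python
# Sketch of the composition-jitter augmentation. The morphology label is
# assumed invariant under small shifts in (a0, b0); the specific +/- pattern
# producing the threefold increase is an illustrative assumption.
import numpy as np

def augment(points, labels, eps_values=(0.002, 0.005)):
    """points: (n, 2) array of (a0, b0); labels: (n,) cluster labels."""
    out_pts, out_lbl = [points], [labels]
    for sign, eps in zip((+1, -1), eps_values):
        out_pts.append(points + sign * eps)   # shifted copy keeps its label
        out_lbl.append(labels)
    return np.vstack(out_pts), np.concatenate(out_lbl)

# Worst-case relative perturbation: the smallest point (0.1, 0.1) shifted by
# 0.005 changes its l2 norm by 0.005/0.1 = 5%, bounding the added uncertainty.
```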
As shown in Figure 10, the prediction process was able to yield maps whereby input values of $ {a}_0 $ and $ {b}_0 $ could be mapped to distinct morphological clusters using the cluster numbers from the manual clustering process. The test data have been overlaid onto each plot for reference. The prediction accuracy for the first case presented, where $ {\chi}_{ij}={\chi}_{ik}={\chi}_{jk}=0.003 $, was found to be 100%. The second case, where $ {\chi}_{ij}={\chi}_{jk}=0.006 $ and $ {\chi}_{ik}=0.003 $, had a prediction accuracy of 93%. The high performance of the GPC in mapping the regimes demonstrates the capability of supervised ML techniques for morphology prediction.
4. Conclusion
Physics-based simulations of ternary polymer demixing were implemented using Cahn–Hilliard theory and a numerical solver. Despite difficulties due to numerical instabilities, a comprehensive data set was generated that covers meaningful parameter ranges, including Flory–Huggins interaction parameters and molecular size.
The performance of conventional dimensionality reduction techniques (PCA and t-SNE) in clustering the simulation results into distinct categories was inadequate for use in downstream supervised learning tasks. Application of ML to the present simulation set remains a challenging task: the techniques struggled to identify unique polymer-blend features that are important for morphology characterization. It may well be possible to apply more sophisticated clustering techniques; however, the time and cost investments may significantly exceed those of direct manual labeling and yield comparatively poorer results. Supervised ML using GPC was able to predict the polymer-blend morphology with ≥93% accuracy; the accuracy is anticipated to increase with the addition of further simulation training data.
The data set (included in the supplementary material) enables users to obtain reasonable first predictions of polymer-blend morphologies for polymers with comparable physical parameters, bypassing computationally expensive simulations: resources can hence be targeted at regions of interest in the physical parameter space. The present framework can be readily extended for ternary polymer blends with modified physical properties. Extension is also envisioned for entirely different systems, including polymer-polymer (PP) and PPS systems, and the coupling of Navier–Stokes models for prediction of shear on polymer-blend morphology.
Acknowledgment
We acknowledge funding from UK Research and Innovation and the Engineering and Physical Sciences Research Council through the PREdictive Modeling with QuantIfication of UncERtainty for MultiphasE Systems (PREMIERE) programme, grant number EP/T000414/1, and the Alan Turing Institute AI for Science and Government programme. O.K.M. acknowledges the Royal Academy of Engineering Research Chair in Multiphase Fluid Dynamics. I.P. acknowledges funding from the Imperial College Research Fellowship scheme (ICRF).
Competing Interests
The authors declare no competing interests exist.
Author Contributions
Conceptualization: O.K.M. and L.R.M.; Methodology: L.R.M., P.I., and I.P.; Software: L.R.M., I.P., and P.I.; Investigation: L.R.M., P.I., and M.H.; Supervision: O.K.M. and L.R.M.; Visualization: P.I. and M.H.; Writing-original draft: P.I. and M.H.; Writing-review & editing: P.I., L.R.M., and I.P. All authors approved the final submitted draft.
Data Availability Statement
The full data set used in this study along with the scripts and environment configuration files needed to run the simulations and machine learning tasks can be found in the following repository: https://github.com/ImperialCollegeLondon/polymer_blend_morphology
Supplementary Materials
To view supplementary material for this article, please visit http://dx.doi.org/10.1017/dce.2020.14.