Probabilistic approach of a flow pattern map for horizontal, vertical, and inclined pipes

A method to predict two-phase liquid-gas flow patterns is presented for horizontal, vertical, and inclined pipes. A set of experimental data (7702 points, distributed among 22 authors) and a set of synthetic data generated using OLGA Multiphase Toolkit v.7.3.3 (59 674 points) were gathered. A filtering process based on the experimental void fraction is proposed. Moreover, a classification of the flow patterns based on supervised learning and a probabilistic flow pattern map based on a Bayesian approach are proposed using four flow patterns: Segregated Flow, Annular Flow, Intermittent Flow, and Bubble Flow. A new visualization technique for flow pattern maps is proposed to understand the transition zones among flow patterns and to provide more information than the flow pattern map boundaries reported in the literature. Following the methodology proposed in this approach, probabilistic flow pattern maps are also obtained for oil-water pipes. These maps were determined using an experimental dataset of 11 071 records distributed among 53 authors and a numerical filter based on the water cut reported by OLGA Multiphase Toolkit v.7.3.3.


Introduction
Liquid-gas multiphase flows are complex physical processes that depend on how the interface deforms, the flow direction, and, in some cases, the compressibility of one of the phases. Given certain operating conditions such as pressure, temperature, liquid or gas velocity, pipe orientation, and fluid properties, several interfacial geometric configurations have been reported in two-phase flow systems. These configurations are commonly known as flow patterns or flow regimes. Some of the reported patterns for vertical flow are bubble flow, slug flow, churn flow, wispy-annular flow, and annular flow, whereas for horizontal flow the patterns include bubble flow, plug flow, stratified flow, wavy flow, slug flow, and annular flow [1,2]. Prediction of these patterns is a matter of concern for designers and operators because of their relation to pressure drop and heat transfer calculations [3][4][5][6][7].
Two-dimensional flow pattern maps serve to visualize the most likely liquid-gas flow pattern found in a section of constant diameter, inclination, fluid properties, and input volumetric gas and liquid flow rates. These maps are usually developed based on measurable parameters along the pipeline segment instead of dimensionless variables such as the Weber, Froude, or Lockhart-Martinelli numbers [8,9]. For instance, Taitel and Dukler [10] proposed a map using superficial velocities; Baker [2] and Hewitt and Roberts [11] developed maps using mass velocities; and […] and Cheng et al. [3,12] considered flow pattern maps based on mass fluxes. Depending on the information used to develop these maps, they can be classified as experimental or mechanistic. The former are obtained from a significant number of experiments, whereas the latter are built from the examination of transition mechanisms using fundamental equations [1]. The commonly implemented flow pattern maps for vertical pipes are those reported by Govier et al. [13], Griffith and Wallis [14], Hewitt and Roberts [11], Golan and Stenning [15], Oshinowo and Charles [16], Taitel et al. [17], Spedding and Nguyen [18], Barnea et al. [19], Spisak [20], Ulbrich [21], and Dziubinski et al. [22]. For horizontal and near-horizontal pipes, the maps of Baker [23], Taitel and Dukler [10], Hashizume [24], and Steiner [25] are often quoted [26]. Finally, for inclined pipes, the maps of Shoham [9], Magrini [27], and Mukherjee [28] are commonly preferred.
Despite the diversity of experimental flow pattern maps, different authors have criticized the subjectivity of these maps, which stems from identification techniques such as visual inspection [29]. Therefore, numerical methods, which discretize and solve commonly accepted equations of flow dynamics, have also been considered to set up approaches like the two-phase model [29]. Simulation tools like OLGA, LedaFlow, PeTra, and fluid dynamic simulators such as CFD codes using Eulerian formulations have been reported in the literature to simulate the behavior of these multiphase flows [30][31][32][33][34][35][36]. Nevertheless, these simulators are commonly designed for pipelines with large diameters and risers, so over-predictions of the void fraction may occur [37].
Although different flow pattern maps are available, there are still difficulties in defining common boundaries between flow regimes [29]. Usually, these boundaries differ significantly among the reported maps, and wide transition areas can be obtained; it has even been suggested that flow patterns occur as a combination of regimes rather than as standalone regimes [38]. Therefore, efforts should focus on determining the predominant flow regime, for which a probabilistic map can be considered. In this direction, a few studies using R134a, R410A, and air-water have been proposed [39][40][41][42][43]. These works evaluate the probability of a flow regime under some operating conditions and flow quality based on a sequence of experimental images. However, probabilistic flow pattern maps based on extensive experimental and synthetic databanks aiming to assess transition zones have not been developed yet. Accordingly, the present work aims to develop a probabilistic approach to identify the most likely flow regime under given operating conditions. To this end, experimental records from an extensive literature review and synthetic data from OLGA v.7.3.3 simulations were used. Oil and Gas Simulator-Schlumberger (OLGA) was selected because it is one of the most commonly used simulators in the Oil and Gas industry and because of its good agreement with experimental results [32,33]. This paper is organized as follows: Section 2 contains a review of flow pattern maps and their classification. Section 3 describes the experimental and synthetic datasets. Section 4 presents the probabilistic flow pattern approach. Section 5 summarizes the basic findings and presents suggestions for future work.
In addition, the paper includes three appendices covering the procedure developed to estimate the posterior probability (Appendix A.1 in Supplementary Material), the probabilistic flow pattern maps for several inclinations (Appendix A.2 in Supplementary Material), and the pattern maps for liquid-liquid flow following a parallel approach (Appendix A.3 in Supplementary Material).
2 Review of flow pattern map classification

2.1 Horizontal flow patterns
According to authors like França and Lahey [44], Zhang et al. [45], and Fan [46], horizontal flow patterns can be classified as follows: stratified smooth flow, stratified wavy flow, elongated bubble flow or plug flow, slug flow, annular flow, and wavy annular flow. These flow patterns can be grouped based on the following considerations: (i) Intermittent flow patterns cover plug and slug flow, as in vertical cases, because they are composed of large bubbles (Taylor bubbles), which are followed by a series of smaller bubbles. The main difference between the plug and slug flow patterns lies in the shape of the elongated bubble and the turbulence generated behind it; (ii) as in vertical pipes, the bubbly flow pattern refers to small spherical bubbles; and (iii) segregated flow patterns, both smooth and wavy, exhibit a clear separation of the liquid and gas phases, which creates a distinct interface between them. For the wavy case, increasing the gas velocity destabilizes the interface and creates waves on the liquid surface [47,48]. This distinction, however, is not significant for the current work [49]. The flow patterns found in the literature can then be grouped as follows (Tab. 1): Segregated Flow, Intermittent Flow, and Bubbly Flow [49].
This classification discriminates the available flow regimes clearly, so it will be used to gather experimental information. Besides, these flow regimes are reported by OLGA; hence, a direct comparison between synthetic and experimental data becomes possible. A graphical scheme of these flow patterns is depicted in Figures 1a-1c.

Vertical flow patterns
Vertical two-phase flows are commonly classified as bubbly flow, slug flow, churn flow, wispy annular flow, and annular flow. This classification has been proposed by different authors and can be found in Taitel and Dukler [10], Carey [50], Ghajar [8], and Falcone et al. [51]. Authors such as Kouba [52], Thomas [53], Omebere-Iyari and Azzopardi [54], and Rosa et al. [55] have proposed minor changes to the aforementioned flow patterns. Therefore, a grouping criterion similar to that reported by Inoue et al. [49] is considered.
The flow patterns are grouped based on the following considerations: (i) The dispersed bubble pattern (bubbly flow) covers isolated bubbles sparsely distributed over the pipe cross-section as well as clusters of bubbles known as discrete bubbles. Isolated bubbles are uniformly sized bubbles that follow a straight path and do not interact with each other. (ii) A cluster of bubbles refers to non-spherical bubbles with a non-uniform size distribution. Factors such as slipping between phases, which were identified by Omebere-Iyari and Azzopardi [54], seem to make sense only for situations where the liquid flows at low velocities. This situation is not of practical importance; therefore, it is not considered in the current grouping criteria. (iii) The intermittent flow pattern encompasses slug flow, churn flow, and unstable churn flow. This grouping is adopted because the authors who identify these patterns consistently describe liquid pistons of considerable length (large and elongated bubbles) followed by smaller bubbles, which are usually spherical. The pattern "slug flow" corresponds to Taylor bubbles; these bubbles occupy much of the cross-section and are followed by smaller spherical bubbles. The pattern "churn flow" corresponds to the destabilization of Taylor bubbles, followed by small, destabilized bubbles. Finally, the unstable churn flow pattern, identified by Rosa et al. [55], describes the coalescence of the Taylor bubbles with the small bubbles. (iv) Kouba [52] and Rosa et al. [55] used semi-annular flow to describe the pattern that occurs between the unstable "churn" and the smooth annular pattern. Semi-annular flow is regarded as a degenerate form of annular flow with high waves at the liquid-gas interface [12,28]. For this application, this pattern is grouped with annular flow.
Based on these considerations, the different flow patterns found in the literature can be grouped as (Tab. 2): Bubbly Flow, Intermittent Flow, and Annular Flow [49], graphically shown in Figures 1d-1f.
For inclined pipes, both the flow patterns presented in the horizontal and vertical orientations will be used.

Experimental dataset
The experimental dataset has 7702 records distributed among 22 authors, as shown in Table 3. These authors performed measurements for different liquid-gas combinations (ranging from refrigerants to viscous oils), superficial velocities, pipe lengths, and diameters. However, the water-air system is the most frequently used combination, with about 62% of all the gathered records. Table 3 also presents the pipe orientation during the experiments, which includes vertical, horizontal, and inclined pipes with upward and downward directions. As expected, the largest number of data points is reported for horizontal cases and the fewest records for vertical downward pipes (−90°). Figure 2 depicts other relevant parameters of the experimental dataset: the distributions of the diameter, the L/D ratio, and the superficial velocities. Note that the pipe diameters follow commercial dimensions (i.e., 1 in., 2 in., and 2.5 in.), and the mean L/D ratio is 455, which ensures flow development for most of the cases. In addition, the flow patterns were classified as Annular flow in 23% of the records, Segregated flow in 21%, Bubbly flow in 14%, and Intermittent flow in 42%.

Selection of flow pattern map axis
Non-slip void fraction is one of the top parameters used to characterize two-phase flows, i.e., to obtain two-phase relative velocity, and to predict flow pattern transitions [48,68]. Void fraction depends on different physical parameters (e.g., gas/liquid velocities and viscosities) and operational parameters (e.g., pipeline length and diameter) [69]. In this work, the void fraction is used to compare the experimental and synthetic records; however, this parameter is often ignored in experimental works [29]. Therefore, this approach focused on a flow pattern map in which the axis features better describe the void fraction.
For this purpose, gas/liquid superficial velocities were used to build the flow pattern map given their relevance in the void fraction prediction. This pair of coordinates is one of the most common in the literature, given its physical meaning and experimental reproducibility [54-57, 62, 63]. This selection is supported by a feature selection based on a predictor importance analysis, which characterizes the general effect of each feature on the experimental void fraction.
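To make the link between superficial velocities and void fraction concrete, the following minimal sketch computes the non-slip (homogeneous) void fraction under the standard definition, in which both phases are assumed to travel at the same velocity. The paper does not state this formula explicitly, and the function name `no_slip_void_fraction` is hypothetical.

```python
def no_slip_void_fraction(v_sg: float, v_sl: float) -> float:
    """Non-slip void fraction from superficial velocities (both in m/s).

    Standard homogeneous-flow definition: alpha = VsG / (VsG + VsL).
    This is an illustration, not the paper's void fraction model.
    """
    if v_sg < 0 or v_sl < 0:
        raise ValueError("superficial velocities must be non-negative")
    total = v_sg + v_sl
    if total == 0:
        raise ValueError("at least one phase must be flowing")
    return v_sg / total


if __name__ == "__main__":
    # Equal superficial velocities give a void fraction of one half.
    print(no_slip_void_fraction(1.0, 1.0))
```

The measured void fraction generally differs from this value because of slip between the phases, which is one reason the experimental void fraction is used here as the comparison variable.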

Synthetic dataset
The steady-state synthetic records were generated with OLGA Multiphase Toolkit v.7.3.3, obtaining 59 574 records with the main parameters listed in Table 4. The effect of pipe inclination on the flow pattern was evaluated by considering 5° steps from −90° to 90°. The remaining required parameters, except for the gas and liquid superficial velocities, were randomly selected within the ranges obtained from the experimental database. This procedure aims to ensure a proper overlap between the experimental and synthetic data by setting the gas velocity from 1e-02 to 40 m/s and the liquid velocity from 1e-03 to 68 m/s. These ranges were chosen based on the available experimental data. Transitions between neighboring flow patterns were represented by a mesh rather than a boundary line. For every inclination, this mesh was refined with synthetic data in those locations where transition zones appeared in the available experimental dataset, i.e., where a higher number of records of two flow patterns was found.
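The sampling design described above can be sketched as follows. This is only an illustration of how such a case list might be assembled before being passed to a simulator: the logarithmic spacing of the velocity grid, the uniform random draws, and the names `log_space`, `build_cases`, and `prop_ranges` are assumptions, not details given in the paper.

```python
import math
import random

# Inclinations from -90 deg to 90 deg in 5 deg steps, as stated in the text.
INCLINATIONS_DEG = list(range(-90, 91, 5))


def log_space(lo, hi, n):
    """n logarithmically spaced values from lo to hi (assumed spacing)."""
    if n < 2:
        raise ValueError("need at least two grid points")
    a, b = math.log10(lo), math.log10(hi)
    step = (b - a) / (n - 1)
    return [10 ** (a + k * step) for k in range(n)]


def build_cases(n_vsg, n_vsl, prop_ranges, seed=0):
    """Cross inclinations with a velocity grid; draw other properties at random.

    prop_ranges maps a hypothetical property name to (min, max), taken from
    the experimental database; a uniform draw is assumed here.
    """
    rng = random.Random(seed)
    cases = []
    for angle in INCLINATIONS_DEG:
        for v_sg in log_space(1e-2, 40.0, n_vsg):      # gas velocity, m/s
            for v_sl in log_space(1e-3, 68.0, n_vsl):  # liquid velocity, m/s
                case = {"angle_deg": angle, "v_sg": v_sg, "v_sl": v_sl}
                for name, (lo, hi) in prop_ranges.items():
                    case[name] = rng.uniform(lo, hi)
                cases.append(case)
    return cases


if __name__ == "__main__":
    demo = build_cases(3, 4, {"diameter_m": (0.02, 0.10)})
    print(len(INCLINATIONS_DEG), len(demo))
```

A local refinement near transition zones, as described in the text, would add extra velocity points to this base grid wherever two flow patterns coexist in the experimental records.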

Experimental data processing
A filtering process was used to select the experimental records within an acceptable error in the void fraction prediction. This acceptable error is determined based on a tolerable difference between experimental and synthetic predictions of the void fraction. However, 4407 experimental records did not report this parameter, and a direct classification is not possible; therefore, a supervised learning approach was used to classify those records as acceptable/non-acceptable for the construction of the flow pattern map. For this purpose, experimental records already classified as acceptable or non-acceptable were used as a training set.
For the classification process, two types of supervised classifiers were considered: (i) Support Vector Machines (SVM) and (ii) K-Nearest Neighbors (KNN). These techniques have also been used in multiphase flow data classification [70,71]. An SVM classifier maps a given set of binary-labeled training data into a high-dimensional feature space and separates the classes with a maximum-margin hyperplane [72]. KNN classifies a record by a majority vote among its K nearest neighbors [73]. Further details about the mathematical description of these classifiers are found in Duda et al. [74], Tarca et al. [70], and Zhang and Wang [71]. The main features of these processes are described below.
The SVM classifier was developed following the recommendations reported by Hsu et al. [75]. A Radial Basis Function (RBF) kernel was implemented because (i) this kernel can handle a nonlinear classification, (ii) it has fewer hyperparameters compared with other kernel functions (e.g., polynomial), and (iii) it has fewer numerical difficulties. Moreover, sensitivity analyses were carried out to determine the penalty ($C$) and the RBF parameter ($\gamma$) that minimize the classification error. For this purpose, a Cross-Validation process with exponentially growing sequences (i.e., $C \in \{2^{-1}, 2^{0}, \ldots, 2^{4}\}$ and $\gamma \in \{2^{-2}, 2^{-1}, \ldots, 2^{3}\}$) was considered, obtaining that $C = 16$ and $\gamma = 0.25$ achieve the highest prediction rate. For the KNN classifier, the number of neighbors for the voting process was evaluated to obtain an accurate classification rate; a sensitivity analysis showed that a nine-neighbor classification has the lowest misclassification rate. The two classifiers (i.e., SVM and KNN) were evaluated following a 10-fold Cross-Validation process to compare their classification rates.
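As an illustration of the tuning procedure, the sketch below builds the two exponential hyperparameter grids quoted above and evaluates a plain KNN classifier with k-fold Cross-Validation using only the Python standard library. It is not the authors' implementation: the SVM itself is not reproduced, and the Euclidean distance, the fold assignment, and the function names `knn_predict` and `cross_val_accuracy` are assumptions.

```python
import random
from collections import Counter

# Exponentially growing grids quoted in the text: 2^-1..2^4 and 2^-2..2^3.
C_GRID = [2.0 ** p for p in range(-1, 5)]
GAMMA_GRID = [2.0 ** p for p in range(-2, 4)]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(xs, ys, k, folds=10, seed=0):
    """Fraction of records classified correctly over all held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]          # every folds-th shuffled record
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        correct += sum(knn_predict(tx, ty, xs[i], k) == ys[i] for i in test)
    return correct / len(xs)


if __name__ == "__main__":
    # Two well-separated synthetic clusters, for demonstration only.
    xs = [(0.01 * i, 0.0) for i in range(30)] + [(10 + 0.01 * i, 0.0) for i in range(30)]
    ys = ["acceptable"] * 30 + ["non-acceptable"] * 30
    for k in (1, 5, 9):
        print(k, cross_val_accuracy(xs, ys, k))
```

Scanning `k` over a range of candidate values and keeping the one with the highest accuracy mirrors the sensitivity analysis that led to nine neighbors; the same loop over `C_GRID` and `GAMMA_GRID` would apply to an SVM implementation.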
Three criteria were considered to determine the tolerable error: (i) as many acceptable records as possible should be included; (ii) the tolerable error should be as low as possible; and (iii) the performance of the classifier should be as good as possible. Based on these criteria, a classifier quality estimator was proposed (Eq. (1)), where C_Q represents the classifier quality [%], tol is the evaluated tolerable error [%], CP is the classifier performance [%] (related to the accuracy rate), and AD is the acceptable data [%]. This criterion was evaluated over an error span from 0% to 50% because more than 90% of the experimental records were included therein (see Fig. 3). Equation (1) uses the residual of the tolerable error because this is the only parameter that should be as low as possible, whereas the classifier performance and the acceptable data should be as high as possible. Based on the results from this quality estimator (Fig. 4), the KNN classifier has the best performance, with a tolerable error between 25% and 30%. Following expert criteria, the tolerable error was set at 25%, i.e., every experimental measurement with an error below 25% was classified as acceptable, and otherwise as non-acceptable. The non-acceptable records include around 75% of the data reported by Majumder (Kerosene, Lube Oil/Air) and Schmidt (Povidone Water/Nitrogen), 146 records for upward directions between 5° and 90°, 209 nearly horizontal records (−5° to 5°), and 138 records of downward experiments (−90° to −5°). These records correspond to 309 records with a diameter lower than 30 mm, 394 records within 30-60 mm, and only one record greater than 100 mm. Regarding the fluid velocities and the void fraction, most VsL and VsG values lay between 1e-01 and 1 m/s, and the majority of the records (63%) had a void fraction lower than 0.5.
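The labeling rule behind this filter can be sketched as below. The exact error definition is not given in the text, so the relative error with respect to the experimental void fraction is an assumption, as are the function names `void_fraction_error_pct` and `label_record`. Records without a reported void fraction are left unlabeled, since these are the ones passed to the supervised classifier.

```python
def void_fraction_error_pct(alpha_exp, alpha_syn):
    """Relative error [%] between experimental and synthetic void fraction.

    Assumed definition: |alpha_exp - alpha_syn| / alpha_exp * 100.
    """
    if alpha_exp <= 0:
        raise ValueError("experimental void fraction must be positive")
    return abs(alpha_exp - alpha_syn) / alpha_exp * 100.0


def label_record(alpha_exp, alpha_syn, tol_pct=25.0):
    """Label one record against the tolerable error (25% in the paper)."""
    if alpha_exp is None:
        return "unlabeled"  # no reported void fraction: left to the classifier
    error = void_fraction_error_pct(alpha_exp, alpha_syn)
    return "acceptable" if error < tol_pct else "non-acceptable"


if __name__ == "__main__":
    for pair in [(0.5, 0.55), (0.5, 0.7), (None, 0.7)]:
        print(pair, label_record(*pair))
```

The labeled records produced this way form the training set for the acceptable/non-acceptable classifier.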
After the supervised classification, 5806 of the 7702 records were classified as acceptable; from now on, these records constitute the experimental dataset for the flow pattern map construction. […] Slug flow; 4. Bubble flow; 5. Two-phase oil/water; 6. Single phase gas; 7. Single phase oil; and 8. Single phase water. In this work, flow regimes 5, 6, 7, and 8 were not considered, whereas flow regimes 0 and 1 were merged into one flow regime (Stratified). For the sake of simplicity, the flow regimes are denoted in this work as 0 (Segregated Flow) - SG; 2 (Annular Flow) - A; 3 (Intermittent Flow) - IT; and 4 (Bubble Flow) - BF. The OLGA Multiphase Toolkit assumes that for vertical or near-vertical pipes (±75° or more) there is no segregated flow, only annular flow. This assumption is linked to the gravitational forces, which prevent this pattern from being observed at these inclinations (as in Fig. 1d). In these cases, the annular flow represents specific types of segregated phases. For horizontal cases, OLGA does not recognize an annular flow pattern, and this pattern is grouped with the segregated flow pattern (0 or SG) [76,77].
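The regrouping of OLGA regime codes described above amounts to a small lookup table. The sketch below encodes only what the text states (codes 0 and 1 merged into SG, codes 2 to 4 kept, codes 5 to 8 discarded); the names `REGIME_GROUP` and `group_regime` are hypothetical.

```python
# Grouping of OLGA regime codes as described in the text.
REGIME_GROUP = {
    0: "SG",  # segregated (codes 0 and 1 are merged)
    1: "SG",
    2: "A",   # annular
    3: "IT",  # intermittent (slug)
    4: "BF",  # bubble
}


def group_regime(code):
    """Return the grouped label, or None for the discarded codes 5 to 8."""
    return REGIME_GROUP.get(code)


if __name__ == "__main__":
    synthetic_codes = [0, 1, 2, 3, 4, 5, 8]
    kept = [group_regime(c) for c in synthetic_codes if group_regime(c) is not None]
    print(kept)
```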
The overlap of the experimental and synthetic datasets for a horizontal pipe is shown in Figure 5a based on superficial velocities and in Figure 5b based on Reynolds numbers. These figures show that the experimental and the synthetic datasets mainly agree in their flow regimes; only an experimental subset was classified differently, as annular flow, which is a flow pattern that was not obtained in the synthetic dataset. Nevertheless, it should be pointed out that it is not possible to generate a grid for the synthetic dataset using the Reynolds number, since the randomization of the fluid combinations would generate nonphysical superficial velocities to match a certain Reynolds number. This impossibility is caused mainly by the liquid viscosity, which varies over several orders of magnitude.

Probabilistic flow map
The primary location of the flow regimes can be determined through the distribution depicted in Figure 5; however, this map cannot yet assess transition areas, and points may overlap. Therefore, an alternative visualization is proposed based on the Probability Density Functions (PDFs) of each flow pattern for both liquid and gas velocities. These PDFs are determined from empirical distributions using histograms and cubic splines to obtain smoother results. The obtained results are shown in Figure 6, where it can be seen that an overlapping classification exists over the transition areas between two flow patterns. For example, it can be seen that the transition area between the Segregated (SG) and Annular (A) flow regimes is mostly located at a gas velocity above […]ties, the specific transition area between two flow patterns could not be assessed, and there is still great uncertainty surrounding the classification results. Therefore, an alternative map was proposed to identify these transition areas regardless of the overlapping problem. A translucent flow pattern map with contour lines was suggested for this purpose.
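The histogram step of the empirical PDFs can be illustrated with the short sketch below. It normalizes bin counts so that the estimate integrates to one; the cubic-spline smoothing mentioned in the text is deliberately omitted, and the function name `histogram_pdf` and the choice of equal-width bins are assumptions. In practice the input might be the base-10 logarithm of the superficial velocities of one flow pattern, which is also an assumption.

```python
def histogram_pdf(values, n_bins, lo, hi):
    """Equal-width histogram density estimate on [lo, hi].

    Returns one density value per bin; sum(density) * bin_width == 1
    whenever at least one value falls inside the interval.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v <= hi:
            k = min(n_bins - 1, int((v - lo) / width))  # hi goes to last bin
            counts[k] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [c / (total * width) for c in counts]


if __name__ == "__main__":
    sample = [0.1, 0.2, 0.6, 0.9]
    print(histogram_pdf(sample, 2, 0.0, 1.0))
```

Computing one such density per flow pattern and plotting them on a common axis shows where two patterns have comparable density, which is where the overlapping classification appears.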
To obtain the translucent flow pattern map, a refined mesh of 200 901 elements (a matrix of 501 rows and 401 columns) was developed. This mesh determines the number of points over the entire span of velocities for every flow regime. Projected surfaces for each flow pattern were developed, using a specific level of transparency according to their order of appearance, and contour lines were added to these surfaces to show the concentration levels of each regime. The resulting translucent map is shown in Figure 7. Note that a considerable number of measurements are located in the transition areas between the contours of different regimes.
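A minimal sketch of how points might be counted on such a mesh is given below. The mesh size (501 by 401, i.e., 200 901 elements) comes from the text; the logarithmic axes, the assignment of rows to liquid velocity and columns to gas velocity, the treatment of elements as cells, and the names `cell_index` and `count_surface` are assumptions.

```python
import math

N_ROWS, N_COLS = 501, 401  # 501 * 401 = 200 901 elements, as in the text


def cell_index(value, lo, hi, n):
    """Index of the log-spaced cell containing a positive value, clamped to [0, n-1]."""
    frac = (math.log10(value) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))
    return min(n - 1, max(0, int(frac * n)))


def count_surface(points, vsl_range=(1e-3, 68.0), vsg_range=(1e-2, 40.0)):
    """Count the (v_sl, v_sg) points of one flow regime falling in each mesh cell.

    Returns a sparse dict {(row, col): count}; empty cells are omitted.
    """
    counts = {}
    for v_sl, v_sg in points:
        key = (
            cell_index(v_sl, vsl_range[0], vsl_range[1], N_ROWS),
            cell_index(v_sg, vsg_range[0], vsg_range[1], N_COLS),
        )
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    demo = count_surface([(1e-3, 1e-2), (1e-3, 1e-2), (68.0, 40.0)])
    print(N_ROWS * N_COLS, demo)
```

One such count surface per flow regime is what the translucent layers and contour lines would be drawn from.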
A probabilistic approach based on the aforementioned refined mesh and the Bayes formula is proposed to deal with the transition areas. According to the Bayes formula, the posterior probability can be determined as [74]:

$$P(Y = y \mid X = x) = \frac{P(X = x \mid Y = y)\,P(Y = y)}{\sum_{\forall i \in Y} P(X = x \mid Y = i)\,P(Y = i)}$$

where $P(Y = y)$ is the prior probability of belonging to category $y$, $P(X = x \mid Y = y)$ is known as the likelihood function, $P(Y = y \mid X = x)$ is the posterior probability, and the denominator $\sum_{\forall i \in Y} P(X = x \mid Y = i)\,P(Y = i)$ is known as the evidence, which can be viewed as a scale factor for the posterior probability.
The following considerations were implemented to calculate the posterior probability for every flow pattern:
1. The prior probabilities were obtained as the ratio between the number of points in category y (N_y) and the total number of points (N_T).
2. The likelihood function is a term chosen to indicate that the category for which it is large is more "likely" to be the correct category [74]. Therefore, surfaces from a refined mesh were obtained for all regimes, as depicted in Figure 8. Z_y(i, j) denotes the height of the surface of flow regime y for given liquid and gas velocities (i, j).
Appendix A.1 in Supplementary Material describes the proposed procedure in more detail.
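The two considerations above can be combined into a cell-wise posterior as in the sketch below. The prior N_y/N_T follows the text; taking the likelihood as the surface height divided by the total height of that regime's surface is an assumption, since the actual procedure is given in Appendix A.1, which is not reproduced here. The sparse dictionary layout and the function name `posterior` are also hypothetical.

```python
def posterior(surfaces, cell):
    """Posterior probability of each flow regime at one mesh cell.

    surfaces: {regime: {(i, j): height}} with raw point counts as heights.
    Returns {regime: probability}, or None where no regime has data.
    """
    n_y = {y: sum(s.values()) for y, s in surfaces.items()}
    n_t = sum(n_y.values())
    if n_t == 0:
        return None
    joint = {}
    for y, surface in surfaces.items():
        prior = n_y[y] / n_t                       # N_y / N_T
        likelihood = surface.get(cell, 0) / n_y[y] if n_y[y] else 0.0
        joint[y] = likelihood * prior              # numerator of the Bayes formula
    evidence = sum(joint.values())                 # scale factor
    if evidence == 0:
        return None
    return {y: p / evidence for y, p in joint.items()}


if __name__ == "__main__":
    demo = {
        "SG": {(0, 0): 3, (1, 0): 1},
        "IT": {(0, 0): 1, (2, 2): 5},
    }
    print(posterior(demo, (0, 0)))
```

Under these assumptions the posterior at a cell reduces to each regime's share of the points in that cell; a smoothed surface Z_y(i, j) would replace the raw counts without changing the structure of the calculation.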
The probabilistic flow pattern map was determined for every flow regime for inclinations of −90° (vertical downward pipe), 0° (horizontal pipe), and 90° (vertical upward pipe), as depicted in Figure 9. These probabilistic maps are promising tools for rejecting possible flow patterns with a high level of probability in an experimental dataset. Note that stratified and annular flow patterns do not have significant contributions for the same inclination, which suggests a significant result in the calculation of the error rate for further experimental assessments. In addition, the intermittent flow pattern shows an interesting behavior: it is favored by upward vertical pipes in comparison to horizontal pipes; however, this is not the case for the downward pipe (−90°), where annular flow is favored due to the gas properties. Figure 9 also shows that the probabilities of belonging to the annular (for 0°) and intermittent (for −90°) flow patterns seem to be negligible; nevertheless, these probabilities are 0.25 and 0.05, respectively.
Finally, the obtained flow pattern map is compared in Figure 10 with two flow maps commonly used in industry (Baker [2] and Taitel and Dukler [10]). The map proposed in this work treats transition zones not as boundaries but as probability areas where more than one pattern may occur at the same time. In addition, maps found in the literature are generally applicable only to the specific operating conditions and pipe configurations of the underlying experiments. In contrast, the approach proposed in this work was generated from a significant experimental and synthetic dataset, so it may be applicable to a broader range of cases.

Conclusion and future perspectives
A probabilistic flow pattern map is proposed for liquid-gas two-phase flow in pipelines based on a comprehensive experimental dataset and synthetic records obtained from OLGA. This map aims to predict probable flow regimes given the gas and liquid superficial velocities and to evaluate possible transition zones among them, which are intended to replace traditional transition boundaries. The map was developed for several upward and downward inclinations from −90° to 90° in steps of 5°.
To build this flow pattern map, acceptable records from the experimental dataset were selected using a tolerable error between the synthetic records and their reported void fraction. For those records lacking this parameter, a supervised learning process was proposed to complete this classification. For this purpose, two learning techniques (i.e., SVM and KNN) were considered. Finally, a Bayesian approach was applied based on the available information and the overlapping information from the experimental and synthetic datasets.
The proposed approach is an alternative to the current flow pattern maps, which are somewhat limited to the configurations under which they were developed. The synthetic records were determined using random properties in OLGA, constrained to the ranges obtained from the experimental dataset. Similar approaches can be proposed using different simulation software, bearing in mind the uncertainty surrounding the experiments and simulations. If new records are added to the current database, a more robust tool can be proposed. For instance, the data can be extended to cover a wider set of tilt angles and not only every 5°.