New empirical correlations for predicting Minimum Miscibility Pressure (MMP) during CO2 injection; implementing the Group Method of Data Handling (GMDH) algorithm and Pitzer’s acentric factor

Miscible injection of carbon dioxide (CO2), which both increases oil displacement and reduces the greenhouse effect, has become one of the pioneering methods in Enhanced Oil Recovery (EOR). Minimum Miscibility Pressure (MMP) is a key indicator for ensuring complete miscibility of the two phases and maximum efficiency of the injection process. Various experimental and computational methods exist to determine this parameter. Experimental methods provide the most accurate and valid results, but they are time-consuming and expensive, which leads researchers to use mathematical methods. Among computational methods, empirical correlations are the most straightforward and simple tools for estimating MMP, especially for gases with impurities. Furthermore, in predicting the miscibility state of an oil-gas system, phase behavior must be taken into account to achieve reliable results. In this regard, equations of state play an indisputable role in predicting the phase behavior of reservoir fluids, and remarkable improvements in their performance have been based on Pitzer's acentric factor. This study therefore includes the acentric factor of the injected gas (impure CO2) as a correlating parameter alongside conventional parameters: reservoir temperature, oil constituents (molecular weight of C5+, ratio of volatiles to intermediates) and critical properties of the injected gas (pseudo-critical pressure and temperature). An empirical correlation is developed using the Group Method of Data Handling (GMDH) algorithm together with the acentric factor of the injected gas, which leads to precise predictions of MMP for impure CO2 injection. GMDH is one of the most robust mathematical modeling methods for predicting physical parameters using linear equations.
A comparison with well-known correlations demonstrated at least a 2% improvement in average absolute error when the acentric factor is included; the final error was 12.89%.


Introduction
Minimum Miscibility Pressure (MMP) is the minimum pressure at which first-contact or multi-contact miscible displacement takes place. This parameter plays an important role in selecting the miscible flooding method for an Enhanced Oil Recovery (EOR) process according to the type and characteristics of the oil reservoir. An accurate estimate of MMP leads to appropriate design of surface facilities for gas injection, better cost management, and an optimized injection pattern.
Various gases are used for injection, including natural gas, flue gas, nitrogen, and supercritical CO2, with varying levels of operational and economic success [1,2]. Among these gases, the high solubility of CO2 in reservoir oil results in extensive mass transfer between the phases [1,2], a reduction in interfacial tension that improves oil sweep by reducing viscosity [3], and final recovery of up to 90% [4,5]. Injecting CO2, a greenhouse gas, also has environmental benefits, since this detrimental gas is removed from the atmosphere and stored underground [4,6-8]. Another advantage of CO2 injection is the considerably lower MMP and the larger number of potential flooding strategies in comparison to other gases [2]. This is due to the higher molecular weight of CO2 compared with commonly used hydrocarbon gases such as methane and ethane, and to the absence of the corrosion problems associated with H2S [3].
To obtain MMP, different experimental and computational methods are available. Experimental methods include the slim tube test, the Vanishing Interfacial Tension (VIT) technique, the multi-contact mixing-cell experiment and the rising bubble apparatus [1]. Computational methods fall into two main groups: (1) Equations of State (EoS) and (2) empirical correlations. For multicomponent injections, semi-analytical and multiple-mixing-cell methods implementing an EoS are appropriate. However, calibrating the EoS against laboratory data requires many steps and can be complicated. For pure gas injection or injection with a small degree of impurities, empirical correlations perform properly.
One of the most accurate experimental methods for determining MMP is the slim tube test, in which the displacement of oil by gas in a porous medium is simulated. Because the slim tube is horizontal and its small diameter keeps the pressure drop across the tube low, fingering and gravitational effects are eliminated, which results in more accurate MMP measurements. On the other hand, the slim tube experiment is substantially time-consuming and costly [1,9]. It is therefore not feasible in miscible injection design, where a great number of MMPs must be determined. In fact, the major application of the slim tube test is to calibrate EoS for phase equilibria calculations and to develop empirical correlations for MMP prediction in pure and impure gas injections.
This study focuses on empirical correlations for predicting MMP in pure and impure gas injections. Empirical correlations are mathematical models developed from experimental data, in which MMP is correlated with physical parameters of the oil and gas and with thermodynamic conditions. These models eliminate the need to repeat time-consuming and costly experiments for each injection [10,11].
Cronquist [12] considered reservoir temperature, C5+ molecular weight, and the mole fraction of light components as the key parameters for MMP prediction. In contrast, Lee [13], Yellig and Metcalfe [14] and Orr and Jensen [15] considered reservoir temperature (T) as the key parameter.
To account for impurities in MMP calculations, correction factors were applied to MMPs obtained for pure gases, and various correction factors were introduced by researchers. Sebastian et al. [16] presented the molar average critical temperature of the mixture (T_cm) as the most accurate parameter for correlating impure MMP.
Alston et al. [5] presented a correlation for pure MMP based on reservoir temperature, C5+ molecular weight, and the ratio of light to intermediate components of the reservoir oil. Impure MMP was calculated by multiplying the pure MMP by a correction factor based on pseudo-critical temperature. Emera and Sarma [17] presented a correlation, developed with a genetic algorithm, based on the same parameters as Alston et al. [5]: reservoir temperature, C5+ molecular weight, and the ratio of light to intermediate components. Liao et al. [18] predicted MMP for low-permeability reservoirs. In addition, impure MMP was obtained from a parameter called the relative MMP, which is the ratio of impure to pure MMP.
Fathinasab and Ayatollahi [1] introduced a correlation for MMP prediction that combines genetic programming with the multivariate search method, based on reservoir temperature, C5+ molecular weight, injected gas pseudo-critical temperature, and the ratio of light to intermediate components. Using gene expression programming, Ahmadi et al. [4] predicted MMP based on the parameters proposed by Fathinasab and Ayatollahi [1], Liao et al. [18], and Alston et al. [5] (T, T_cm, light to intermediate components ratio, and C5+ molecular weight). However, implementing these implicit methods in operational applications would be complex.
The above-mentioned empirical correlations have significant errors for high-temperature reservoirs. Furthermore, applying correction factors to pure MMP in order to predict impure MMP is error-prone, since the error in the pure MMP calculation propagates into the impure MMP result and causes additional error.
This study aims to provide an accurate, explicit and simple empirical correlation for MMP prediction with lower computational error than prior correlations. The need for correction factors in impure MMP prediction is also eliminated: two separate correlations are developed to predict pure and impure CO2 MMP using multi-variable optimization algorithms. Furthermore, as a unique feature of this study, two new parameters are implemented in the impure MMP correlation: the ratio of reservoir temperature to pseudo-critical temperature, and the acentric factor (ω). These two parameters can effectively capture the impact of injected gas impurities on MMP predictions.
2 Data analysis/Experimental
2.1 Data analysis
Data are divided into two categories. The first category includes 126 data points used to develop the correlation for pure CO2 injection. The second category includes 126 data points used to develop the correlation for impure CO2 injection.
Most of the data used in this study were collected from the literature [5,7,16-24]. Previous correlations were also developed from these data [5,7,16-24]. The dataset is therefore not new, except for some data points added from Iranian oil fields for both pure CO2 injection (two points) and impure CO2 injection (four points). The Iranian data were added to extend the temperature range, because Iranian oil fields include relatively high-temperature deep reservoirs.

Experimental
The Iranian datasets were obtained from slim tube experiments. Tables 1 and 2 present the oil properties of these reservoirs.
The characteristics of the slim tube used in this study are presented in Table 3.
The slim tube is a stainless steel tube packed with glass beads that reasonably simulates one-dimensional flow through a porous medium. Before the test, toluene is injected into the slim tube to clean it, and N2 is then injected to remove the remaining toluene. A vacuum pump then evacuates the porous medium for several hours.
At the beginning of each fluid displacement test, the slim tube is saturated with the oil (reservoir fluid) at reservoir temperature and at a pressure above the bubble point pressure. Gas is then injected into the tube at a constant flow rate (1.2 pore volumes) by an injection pump so that the miscibility process can occur. This process is repeated at several pressures, and a sight glass is installed for observing the flow. An accumulator and measuring systems at the end of the tube detect gas breakthrough and record the producing gas-oil ratio and composition as functions of the injected volume. The schematic diagram of the slim-tube test used in this study is shown in Figure 1.
In the common experimental procedure for determining the MMP of a CO2/crude oil system, once the injected CO2 becomes miscible with the crude oil, an inflection point is observed in the curve of recovery factor versus displacement pressure, and recovery no longer improves appreciably with further step increases in pressure (Fig. 2).
Table 4 presents the experimental MMP data obtained in this study. As can be seen in this table, three experimental points were obtained for each fluid sample using slim-tube testing.
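The break in the recovery curve described above can be located numerically. The following is a minimal sketch, assuming the common two-line approach in which straight lines are fitted to the low-pressure and high-pressure branches and their intersection is taken as the MMP. The function name `estimate_mmp` and the pressure and recovery values are hypothetical and are not data from this study.

```python
import numpy as np

def estimate_mmp(pressure, recovery):
    """Estimate MMP as the intersection of two straight lines fitted to
    the low-pressure and high-pressure branches of a recovery curve."""
    p = np.asarray(pressure, dtype=float)
    r = np.asarray(recovery, dtype=float)
    best = None
    # Try every split that leaves at least two points on each branch.
    for k in range(2, len(p) - 1):
        lo = np.polyfit(p[:k], r[:k], 1)  # [slope, intercept] below the break
        hi = np.polyfit(p[k:], r[k:], 1)  # [slope, intercept] above the break
        if np.isclose(lo[0], hi[0]):
            continue  # parallel branches have no intersection
        sse = (np.sum((np.polyval(lo, p[:k]) - r[:k]) ** 2)
               + np.sum((np.polyval(hi, p[k:]) - r[k:]) ** 2))
        if best is None or sse < best[0]:
            # Intersection of r = a1*p + b1 and r = a2*p + b2.
            best = (sse, (hi[1] - lo[1]) / (lo[0] - hi[0]))
    if best is None:
        raise ValueError("need at least four points with a change in slope")
    return float(best[1])

# Hypothetical slim-tube results: pressure in MPa, recovery factor in %.
pressures = [10, 12, 14, 16, 18, 20, 22]
recoveries = [40, 50, 60, 70, 80, 81, 82]
mmp = estimate_mmp(pressures, recoveries)
```

With only three measured points per fluid sample, as in Table 4, a two-line fit of this kind is not determined, so the break would have to be read from the curve directly.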

Theory
The Group Method of Data Handling (GMDH) was developed by a Russian cyberneticist, Prof. Alexey Ivakhnenko, in 1966. In standard regression models, the only criterion is the least squared error, and thus it cannot be determined whether the final model is simple or complex. However, using the Ivakhnenko polynomials, one can obtain a polynomial with optimal complexity [25,26].
The GMDH algorithm is robust, gives unique answers and produces linear and explicit correlations. This method is well suited to complex, multidimensional problems with limited data [27], as is the case in this article.
The GMDH algorithm creates a process for developing higher-order polynomials as in equation (1), which relates m input parameters u_1, u_2, ..., u_m to a single target parameter y. In the GMDH algorithm, it is not necessary to use all forms of the summations (double, triple, quadruple, etc.) in equation (1); which ones are needed depends on the difficulty of modeling the system. In systems with more variables, a higher-order summation might be essential for accurate modeling. In this study we intended to make the correlations as simple as possible (with as few constants as possible) while keeping significant accuracy. To this end, based on the optimization process, only some variables from the single and double summation terms are used, as can be seen for the pure CO2 MMP system (Sect. 4.1) and the impure CO2 MMP system (Sect. 4.2). Including the triple and quadruple summations would make the correlation more complex without noticeably improving its accuracy here.
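For reference, the general Ivakhnenko (Kolmogorov-Gabor) polynomial on which GMDH is built is commonly written as below. The coefficient symbols a_0, a_i, a_ij and a_ijk are notation assumed here and may differ from the symbols used in equation (1).

```latex
y = a_0 + \sum_{i=1}^{m} a_i u_i
        + \sum_{i=1}^{m} \sum_{j=1}^{m} a_{ij} u_i u_j
        + \sum_{i=1}^{m} \sum_{j=1}^{m} \sum_{k=1}^{m} a_{ijk} u_i u_j u_k
        + \cdots
```

The single, double and triple summations are the terms the text refers to; truncating after the double summation gives a model that is quadratic in the inputs but still linear in the unknown constants.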
The data are divided into training and testing datasets. The training dataset is used to develop the correlation and the testing dataset is used for validation. To this end, 70% of the data are randomly assigned to the training subset and the remaining 30% to the test subset.
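A random 70/30 split of this kind can be sketched as follows. The helper name `split_train_test` and the fixed seed are assumptions made for reproducibility of the illustration; the study does not state how the random assignment was performed.

```python
import numpy as np

def split_train_test(n_points, train_fraction=0.7, seed=0):
    """Return shuffled index arrays for the training and test subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_points)          # random ordering of 0..n-1
    n_train = int(train_fraction * n_points)   # truncates toward zero
    return order[:n_train], order[n_train:]

# 126 data points, as in each of the pure and impure datasets.
train_idx, test_idx = split_train_test(126)
```

For 126 points this yields 88 training and 38 test indices, which matches the subset sizes reported for both datasets.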
In the first step, all possible first-order polynomials are created for the existing input parameters. The constants are determined such that each resulting polynomial has the least sum of squared errors with respect to the training data. For example, assuming only linear relationships, a set of polynomials is constructed for a three-parameter function, as can be seen in equation (2). In the second step, for each of these polynomials, the sum of squared errors is calculated for the test data as in equation (3):

$$d_j = \sum_{i} \left( z_i - y_{ij} \right)^2 \qquad (3)$$

where z_i represents the ith test data point, y_ij represents the corresponding prediction of the jth equation, and d_j is the sum of squared errors of the jth polynomial.
In the third step, the polynomial with the least squared error is selected as the solution. If the solution is unsatisfactory, other forms such as the division or multiplication of the current parameters can be introduced as new parameters alongside the previous ones. The number of parameters thus changes, and with it the number of equations. The process then starts again from the first step.
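The three steps above can be sketched as a single selection pass. This is a simplified illustration, not the implementation used in the study: it assumes the candidate set consists of one linear polynomial per non-empty subset of the inputs, and the function names and the synthetic data are hypothetical.

```python
import itertools
import numpy as np

def fit_linear(U, y):
    """Least-squares fit of y = c0 + sum_k c_k * u_k; returns [c0, c1, ...]."""
    A = np.column_stack([np.ones(len(U)), U])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, U):
    return coef[0] + U @ coef[1:]

def select_polynomial(U_train, y_train, U_test, z_test):
    """Step 1: fit each candidate on the training data.
    Step 2: compute d_j, the sum of squared errors on the test data.
    Step 3: return (d_j, columns, coefficients) with the smallest d_j."""
    m = U_train.shape[1]
    best = None
    for size in range(1, m + 1):
        for cols in itertools.combinations(range(m), size):
            idx = list(cols)
            coef = fit_linear(U_train[:, idx], y_train)
            d_j = float(np.sum((z_test - predict_linear(coef, U_test[:, idx])) ** 2))
            if best is None or d_j < best[0]:
                best = (d_j, cols, coef)
    return best

# Synthetic, noise-free illustration (not data from the study):
# the target depends on inputs 0 and 2 only.
rng = np.random.default_rng(0)
U = rng.uniform(size=(126, 3))
y = 2.0 + 3.0 * U[:, 0] - 1.5 * U[:, 2]
d_best, cols_best, coef_best = select_polynomial(U[:88], y[:88], U[88:], y[88:])
```

If the selected polynomial were not accurate enough, products or ratios of columns of `U` could be appended as extra columns and the same pass repeated, which corresponds to the return to the first step.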
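4 Results and discussion

The volatile to intermediate oil fraction (x) is defined in equation (4), where mol stands for the mole fraction of each component:

$$x = \frac{\mathrm{mol}_{C_1} + \mathrm{mol}_{N_2}}{\mathrm{mol}_{C_2} + \mathrm{mol}_{C_3} + \mathrm{mol}_{C_4} + \mathrm{mol}_{CO_2} + \mathrm{mol}_{H_2S}} \qquad (4)$$

The proposed correlation for pure CO2 MMP is given in equation (5), in which A1-A6 are constants listed in Table 6.
In this study, Mean Squared Error (MSE) is used to estimate absolute deviation, and Average Absolute Percentage Relative Error (AAPRE) is used to estimate relative error. These measures are defined as follows:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( \mathrm{MMP}^{observed}_i - \mathrm{MMP}^{calculated}_i \right)^2$$

$$\mathrm{AAPRE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{\mathrm{MMP}^{observed}_i - \mathrm{MMP}^{calculated}_i}{\mathrm{MMP}^{observed}_i} \right|$$

where MMP^observed_i is the ith observed (experimental) MMP value, MMP^calculated_i is the ith calculated MMP value and n is the number of data points.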
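The two error measures can be computed as in the sketch below, which assumes the standard definitions of MSE and AAPRE. The function names and the two sample values are hypothetical.

```python
import numpy as np

def mse(observed, calculated):
    """Mean squared error between observed and calculated MMP values."""
    o = np.asarray(observed, dtype=float)
    c = np.asarray(calculated, dtype=float)
    return float(np.mean((o - c) ** 2))

def aapre(observed, calculated):
    """Average absolute percentage relative error, in percent."""
    o = np.asarray(observed, dtype=float)
    c = np.asarray(calculated, dtype=float)
    return float(100.0 * np.mean(np.abs(o - c) / o))

# Hypothetical MMP values in MPa.
observed = [10.0, 20.0]
calculated = [11.0, 18.0]
mse_value = mse(observed, calculated)      # (1 + 4) / 2 = 2.5
aapre_value = aapre(observed, calculated)  # both points are 10% off
```

MSE carries the squared unit of MMP and weights large deviations heavily, while AAPRE is unitless, which is why both are reported side by side.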
For pure CO2, 126 data points were available. Of these, 70% (88 data points) were used to develop the correlation and the remaining 30% (38 data points) were used to evaluate and test it. The performance of the proposed correlation on each dataset (train, test and total) is presented in Table 7 in terms of AAPRE and MSE.
It is not known whether previous authors set aside a specific part of the data for testing, or, if they did, which part. It is therefore reasonable to compare the developed correlation with other correlations on all available data points, not just the test data.
Accordingly, the MSE and AAPRE of well-known correlations for pure MMP, calculated on all pure data points, are given in Table 8. The table is sorted by descending AAPRE.
As seen in Table 8, the Liao et al. correlation [18] shows the highest error. In contrast, the correlation presented in this study has the lowest error, reducing the error by at least 3.3% compared to the existing correlations.
In Figure 3, experimental versus calculated MMP graphs for the three existing correlations with the highest accuracy, Fathinasab and Ayatollahi [1], Ahmadi et al. [4] and Emera and Sarma [17], are compared with the correlation proposed in this paper. In these graphs, the vertical and horizontal axes show the experimental MMP and the corresponding calculated MMP, respectively.

The impure CO2 MMP correlation is based on the molecular weight of C5+ (Mw_C5+), the volatile to intermediate ratio (x, Eq. (4)), the pseudo-critical pressure of the injected gas (P_pc), the relative pseudo-reduced temperature, i.e. the ratio of reservoir temperature to pseudo-critical temperature (T_pr), and the average acentric factor of the injected gas (ω). The parameters T_pc and P_pc are defined in equations (8) and (9) in accordance with Kay's rule [28]:

$$T_{pc} = \sum_i y_i T_{ci} \qquad (8)$$

$$P_{pc} = \sum_i y_i P_{ci} \qquad (9)$$

T_pr is defined in equation (10):

$$T_{pr} = \frac{T}{T_{pc}} \qquad (10)$$

and ω is the molar average given by equation (11) [29]:

$$\omega = \sum_i y_i \omega_i \qquad (11)$$

where y_i stands for the mole fraction of the ith component in the gas in equations (8)-(11), and T_ci and P_ci represent the critical temperature and pressure of the ith component, respectively. As mentioned earlier, 126 data points were used to develop the impure CO2 MMP empirical correlation. The range of data used for developing this correlation is given in Table 9.
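The mixing rules in equations (8)-(11) can be evaluated as in the sketch below. The function name, the gas composition and the reservoir temperature are hypothetical, the component properties are approximate rounded handbook values rather than values taken from this study, and absolute temperature (K) is assumed for the ratio T/T_pc.

```python
# Approximate rounded handbook values (assumed here, not from the study):
# critical temperature in K, critical pressure in MPa, acentric factor.
CRITICAL_PROPS = {
    "CO2": (304.1, 7.38, 0.225),
    "CH4": (190.6, 4.60, 0.011),
    "N2": (126.2, 3.39, 0.037),
    "H2S": (373.5, 8.96, 0.094),
}

def pseudo_properties(composition, reservoir_temp_k):
    """Mole-fraction-weighted (Kay's rule) properties of an injected gas.

    composition maps component name to mole fraction y_i (must sum to 1).
    """
    if abs(sum(composition.values()) - 1.0) > 1e-6:
        raise ValueError("mole fractions must sum to 1")
    t_pc = sum(y * CRITICAL_PROPS[g][0] for g, y in composition.items())
    p_pc = sum(y * CRITICAL_PROPS[g][1] for g, y in composition.items())
    omega = sum(y * CRITICAL_PROPS[g][2] for g, y in composition.items())
    return {
        "T_pc": t_pc,                      # Eq. (8)
        "P_pc": p_pc,                      # Eq. (9)
        "T_pr": reservoir_temp_k / t_pc,   # Eq. (10), absolute temperatures assumed
        "omega": omega,                    # Eq. (11)
    }

# Hypothetical impure CO2 stream with 10 mol% methane at 350 K.
props = pseudo_properties({"CO2": 0.90, "CH4": 0.10}, reservoir_temp_k=350.0)
```

Light impurities such as methane or nitrogen lower both T_pc and ω relative to pure CO2, while H2S raises T_pc, so the three mixture properties together describe how an impurity shifts the gas away from pure CO2.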
It is noteworthy that while the temperature of the published data mostly ranged from 32.2 to 118.3 °C, the data from Iran used in this paper extend the temperature range up to 143 °C, owing to the greater depth and higher temperature of these reservoirs. Moreover, relative to former studies (Liao et al. [18] and Alston et al. [5]), the range of molecular weight of components heavier than pentane is extended considerably by the Iranian data, to between 154 and 350.3 g/gmole. The amounts of injected gas impurities in the impure MMP dataset are given in Table 10.
It is worth noting that in previous studies [1,4], T and T_pc were considered as separate correlating parameters. In this study, their ratio, called the relative pseudo-reduced temperature, is chosen as a correlating parameter, because the coefficient of determination observed between T_pr and MMP is higher than that between MMP and either T_pc or T.
The linear coefficient of determination can be taken as a measure of the correlation between the target parameter and an input parameter. The correlation with MMP is highest for T_pr, compared with T_pc and T (Fig. 4). Therefore, it is more suitable to develop the correlation based on T_pr (Eq. (10)).
Pitzer's acentric factor [30] was introduced in 1955 to extend the corresponding states principle and to increase its reliability and accuracy in modeling and predicting fluid properties. It is defined as follows:

$$\omega = -\log_{10}\left( P_r^{sat} \right)_{T_r = 0.7} - 1$$

where T_r = T/T_c is the reduced temperature and P_r^sat = P^sat/P_c is the reduced saturation vapor pressure. The use of the acentric factor has significantly improved the prediction of fluid phase behavior and the calculation of reservoir fluid properties [31-34].
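As a small worked illustration of this definition, the acentric factor follows directly from the saturation pressure measured at a reduced temperature of 0.7. The function name and the input values are hypothetical.

```python
import math

def acentric_factor(p_sat_at_tr07, p_c):
    """Pitzer acentric factor from the saturation pressure at T_r = 0.7.

    p_sat_at_tr07 and p_c must be in the same pressure unit.
    """
    reduced_p_sat = p_sat_at_tr07 / p_c
    return -math.log10(reduced_p_sat) - 1.0

# A fluid whose reduced saturation pressure at T_r = 0.7 equals 0.1
# has an acentric factor of zero by construction of the definition.
omega_simple = acentric_factor(0.1, 1.0)
# A lower reduced saturation pressure gives a larger acentric factor.
omega_heavier = acentric_factor(0.01, 1.0)
```

The offset of 1 in the definition is what makes simple, nearly spherical fluids fall at ω close to zero, so ω measures the departure of a fluid from that reference behavior.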
In the evolution of equations of state, Soave [31] proposed a correction factor, a function of the acentric factor, for the attractive term of the Redlich and Kwong [35] EoS, which had previously been a function of temperature only. The proposed form of the temperature dependency of the attractive term was subsequently incorporated in the development of later equations of state. Pitzer's acentric factor has also played an important role in developing three-parameter equations of state, which led to significant improvements in the prediction of fluid volumetric data. This impact can be clearly observed in equations of state such as those of Schmidt and Wenzel [33], Esmaeilzadeh and Roshanfekr [32] and Patel and Teja [34].
The success of these earlier applications of the acentric factor in phase behavior prediction suggested using it as a correlating parameter for MMP alongside the parameters mentioned previously.
The reason for presenting two separate correlations for pure and impure CO2 is to maintain both simplicity and accuracy, and to avoid correction factors for impure MMP prediction, which would introduce additional errors. Given the importance and prevalence of impure CO2 injection in the oil and gas industry (since pure CO2 is rarely available), a separate correlation for impure CO2 injection is preferred so that accuracy and simplicity are achieved together.
Moreover, using a dimensionless relative temperature (T_pr), a dimensionless pressure ratio (MMP/P_pc), a relative molar ratio (x) and the molecular weight of C5+ (Mw_C5+), which can be treated as dimensionless despite its unit, results in a semi-dimensionless correlation and also reduces the number of correlation parameters.
For impure CO2, 126 data points were available. It should be emphasized that these 126 data points are completely distinct from the 126 data points used for pure CO2. Of these, 70% (88 data points) were used to develop the correlation and the remaining 30% (38 data points) were used to evaluate and test it. The performance of the proposed correlation on each dataset (train, test and total) is presented in Table 12 in terms of AAPRE and MSE.
Again, it is not known whether previous authors set aside a specific part of the data for testing, or, if they did, which part. It is therefore reasonable to compare the developed correlation with other correlations on all available data points, not just the test data.
Experimental versus calculated MMP graphs for the three correlations with the least error in Table 13 are compared with the correlation of this study in Figure 5. For the present study, the data points are well clustered around the diagonal line, which indicates the higher accuracy of the proposed correlation for MMP calculation.

Sensitivity analysis
In this study, the sensitivity analysis was performed on the parameters of both the pure and impure correlations using the relevancy factor:

$$r(\mathrm{input}_k, \mathrm{MMP}) = \frac{\sum_i \left( \mathrm{input}_{k,i} - \mathrm{input}_{ave,k} \right) \left( \mathrm{MMP}_i - \mathrm{MMP}_{ave} \right)}{\sqrt{\sum_i \left( \mathrm{input}_{k,i} - \mathrm{input}_{ave,k} \right)^2 \sum_i \left( \mathrm{MMP}_i - \mathrm{MMP}_{ave} \right)^2}} \qquad (14)$$

In equation (14), input_k,i and input_ave,k are the ith value and the average value of the kth input, respectively. The index k refers to each correlating parameter, e.g. temperature, volatile to intermediate ratio, etc.; MMP_i stands for the ith value of predicted MMP and MMP_ave is the arithmetic average of the predicted MMP values. r shows the effect of each parameter (each input_k) on the correlation output (MMP in this study). If r > 0, the associated parameter has a positive effect; if r < 0, it has a negative effect. r ranges from -1 to 1, which indicate the strongest negative and positive effects, respectively.
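The relevancy factor can be computed as in the following sketch, which assumes the usual Pearson form implied by the definitions above. The function name and the sample arrays are hypothetical.

```python
import numpy as np

def relevancy_factor(input_values, mmp_values):
    """Relevancy factor r between one input parameter and predicted MMP."""
    x = np.asarray(input_values, dtype=float)
    y = np.asarray(mmp_values, dtype=float)
    dx = x - x.mean()   # input_{k,i} - input_{ave,k}
    dy = y - y.mean()   # MMP_i - MMP_ave
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

# Hypothetical values: an input that rises with MMP and one that falls with it.
mmp_pred = [10.0, 12.0, 14.0, 16.0]
r_direct = relevancy_factor([50.0, 60.0, 70.0, 80.0], mmp_pred)
r_inverse = relevancy_factor([8.0, 7.0, 6.0, 5.0], mmp_pred)
```

Because r is normalized by the spread of both variables, it compares the direction and strength of each input's linear association with MMP regardless of the input's unit.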
The results of the sensitivity analysis on the pure MMP correlation are depicted in Figure 6. As can be seen, all parameters (x, Mw_C5+ and T) have a positive effect on the predicted MMP. Temperature has the largest effect and C5+ molecular weight the smallest. The impact of these parameters in this study is consistent with the results of previous studies [7,18,36].
The results of the sensitivity analysis for the parameters of the impure MMP correlation (x, P_pc, Mw_C5+, ω and T_pr) are shown in Figure 7 (T_pr combines two parameters, T and T_pc). Previous researchers reported a direct relation of MMP with T [7,18,36] and an inverse relation with T_pc [7]. Consequently, their ratio (T/T_pc), reported as T_pr, is expected to have a direct relation with MMP. The effects of the parameter x and of C5+ molecular weight are similar to those for pure MMP. The sensitivity analysis on the correlating parameters in this study confirms these results.
A gas with a high molecular weight has a lower P_pc. As a result, one can assume that increasing the molecular weight of the gas has the same effect as decreasing P_pc: as the molecular weight of the gas increases, P_pc decreases and hence so does the MMP. The results of the sensitivity analysis confirm this conclusion.
In general, as the gas molecular weight increases, the acentric factor (ω) increases and, accordingly, MMP decreases, which explains the inverse relation between the acentric factor and MMP. Among the parameters mentioned above, T_pr, P_pc and ω showed the greatest impact, while x and Mw_C5+ showed the least.

Conclusion
By reviewing experimental data from previous studies together with data from two Iranian reservoirs, a database with a wider temperature range was collected. Correlations for predicting MMP in pure and impure CO2 injection operations were then developed using the GMDH algorithm.
1. The correlation proposed to predict pure MMP is an explicit correlation based on Mw_C5+, x and T. It improves on previous correlations and reduces the computational error by at least 2.5%. This remarkable decrease in computational error corroborates the robustness of the GMDH approach.
2. The correlation proposed to predict impure MMP is explicit and does not use a correction factor. The effective parameters in this correlation include T_pr, x, Mw_C5+ and ω. By using the dimensionless temperature and expressing the correlation as MMP/P_pc, the correlation is developed in a semi-dimensionless form. Such non-dimensionalization not only shortens the correlation but also makes it applicable to many reservoirs. Employing the GMDH approach together with the gas acentric factor resulted in at least a 2% decrease in computational error compared to previous studies.
3. Since data from Iranian reservoirs were used to develop the new correlations, they can be used to predict CO2 MMP in deep, high-temperature reservoirs such as some Middle East reservoirs.

Fig. 2. Schematic graph of oil recovery versus injection pressure obtained from slim-tube test.

Fig. 3. Comparison of experimental versus calculated MMP graphs for three correlations with minimum error.

Fig. 4. Comparison of data correlations for T, T_pc, T_pr.

Fig. 5. Comparison of experimental versus calculated MMP graphs for three correlations with minimum error.

Table 1. Properties of field oil Darkhovin.

Table 2. Properties of field oil Yadavaran.

Table 4. Experimental data obtained from slim-tube testing.

Table 5 shows the range of reservoir temperature (T) in °C, the molar ratio of volatile (C1, N2) to intermediate components (C2, C3, C4, CO2, H2S), the average molecular weight of components heavier than pentane (Mw_C5+), and the pure MMP in MPa:

Table 5. Range of oil properties for pure CO2 injection.

Table 7. Performance evaluation of proposed pure MMP correlation.

Table 8. MSE and AAPRE error for pure MMP.

Accumulation of data around the diagonal line in the graphs of Figure 3 indicates the accuracy of each correlation in predicting MMP. As can be seen, the graph plotted for the present study shows the best accumulation around the diagonal line y = x.

Table 9. Range of oil properties for impure CO2 injection.

Table 10. Range of impurities along with injected CO2.

Table 11. Correlation parameters for impure MMP.

Table 12. Performance evaluation of proposed impure MMP correlation.

Table 13. MSE and AAPRE error for impure MMP.