A Genetic Algorithm Integrated with Monte Carlo Simulation for the Field Layout Design Problem

Oil and gas production is moving deeper and further offshore as energy companies seek new sources, making the field layout design problem even more important. Although many optimization models are presented in the reviewed literature, they do not properly consider the uncertainties in well deliverability. This paper presents a Monte Carlo simulation integrated with a genetic algorithm that addresses the stochastic nature of the problem. Based on the results obtained, we conclude that the probabilistic approach brings important new perspectives to field development engineering.


Introduction
The field layout design problem is a growing concern for the offshore sector. As the water depth of oil fields increases, difficulties in operating at high pressures and low temperatures appear. In order to reduce the huge investment and operational costs, it is important to develop a detailed plan to lower the losses involved. A potential approach is to place and allocate the main facilities such as platforms and manifolds in an optimum manner in order to avoid the pressure loss in flowlines, so the reservoir energy can be fully exploited.
The field design problem is related to many factors, such as seabed topography, reservoir volume, well flow rates, and drilling and facilities costs. Several authors have studied this problem, among which the following can be highlighted. Rothfarb et al. (1970) initially proposed techniques and heuristic procedures to minimize the investment and operational costs through the optimization of pipeline diameter, network and expansion of gas fields. Devine and Lesso (1972) wrote the seminal paper on the field design problem, which considered the trade-offs between platform capacity and drilling costs through an iterative two-stage algorithm. Frair and Devine (1975) presented a nonlinear mathematical model that, besides including the location-allocation of platforms, considered the scheduling of oilfield operations and production rates for different time periods, although assuming a linear production decline curve for each reservoir. Dogru (1987) proposed a nonlinear mixed-integer programming model to optimize platform and well locations and to maximize the total productive potential. However, the calculations became prohibitive beyond five platforms and 1 000 possible well locations. Grimmett and Startzman (1988) presented an integer programming model which used the binary implicit enumeration method in order to determine the size, location and allocation of major offshore facilities. However, this model had to consider an exponentially growing number of solutions as the number of variables increased. Hansen et al. (1992) formulated the optimal well assignment to platforms as a multicapacitated plant location problem, addressed both with an integer programming model and with a tabu search heuristic. The exact model faced many numerical difficulties above 30 possible locations for platforms and 100 wells.
Carvalho and Pinto (2006) maximized the Net Present Value (NPV) of several oil field development plans through a mixed-integer model integrated with a bilevel decomposition algorithm in order to solve large-scale problems. The master problem determined the assignment of platforms to wells and a planning subproblem calculated the timing for the fixed assignments. Multiple reservoirs were also considered within the model. Although it could solve problems of realistic dimension, the authors recommended further studies on more efficient methods for dealing with investment constraints. Rosa (2006) aimed to maximize the NPV of a field development project by optimizing the platform location in a grid through an exhaustive search model, determining for each solution the production rate based on the pressure drop across the production lines, and also considering riser and flowline costs. As the method requires exhaustive search, the computational time increased drastically with the number of nodes considered. Rosa and Ferreira Filho (2012) broadened this study by also considering the location of manifolds and the interconnections between manifolds and wells. Wang et al. (2012) proposed a mathematical model for the layout optimization of cluster manifolds and programmed its algorithm in MATLAB. Numerical analyses were performed and discussed for a case study of subsea well partitioning to verify the accuracy of the model and the feasibility of the algorithm. Wang et al. (2014) further studied this problem by optimizing the layout of cluster manifolds with pipeline end manifolds, where a mathematical model and its dedicated algorithm in MATLAB were also proposed. Both papers concluded that the layout optimization problem can be described accurately by the presented mathematical models and that the given algorithms converge efficiently.
Rodrigues et al. (2016), in order to minimize the development costs of an oil field, proposed a binary linear programming model that integrated interconnected decisions such as the number and location of wells and platforms, the location of manifolds, the well geometry, the production capacity of the platforms, and the interconnections between platforms, manifolds and wells. Two case studies were proposed, and the results were consistent with reality. Zhang et al. (2017) proposed a mixed-integer linear programming model to optimize the production of a well fluid gathering system by minimizing the total investment. With terrain and obstacle conditions fed to the model along with other operational constraints, the optimal topological structure, the position of the central processing facility, and the diameter and route of each pipeline were obtained together by solving this model with the GUROBI solver. Two virtual oil-gas fields and a real-world gas field were taken as case studies in order to verify the reliability and practicality of the model.
The usual approach of field layout design studies is to minimize the investment costs of oil and gas facilities, as well as to maximize the NPV of the project, as presented in Túpac et al. (2002), Sales (2010), Souza (2011), and Rahmawati et al. (2012). The location of facilities is a hard decision process because it involves different and sometimes conflicting criteria, and because the savings can differ significantly among the possible alternatives.
In addition, at the start of the development project, there is insufficient information about the reservoir to estimate oil production accurately (Serapião et al., 2012; Touzani and Busby, 2014). Hence, deterministic approaches for the field layout design problem may not perform well in the presence of these uncertainties because they address poorly, or even ignore completely, the many possible scenarios that could occur in the future. In order to evaluate problems of the petroleum industry considering uncertainties, statistical analyses and statistical simulators were proposed, among which we highlight Murtha (1994), Huffman and Thompson (1994), Gilman et al. (1998), Kitchel et al. (1997), Cheng et al. (2010) and Can and Kabir (2012). However, no studies about the field layout design regarding the uncertainty aspect are known to the authors. Therefore, this paper presents a genetic algorithm to obtain adequate solutions considering the probabilistic nature of the field layout design problem through a Monte Carlo simulation. A greedy algorithm is employed to evaluate the performance of the genetic algorithm. We also provide a set of instances to allow comparison with future work.
The remainder of this paper is structured as follows: in the next section, the problem is stated in detail; in the third section, the proposed approach is presented; in the fourth section, case studies are presented; finally, in the fifth section the computational results are discussed.

Problem statement
As already mentioned, the field layout design depends on well flow rates, which depend on the pressure drop in the flowlines and therefore on the energy balance of the system, shown in its differential form in equation (1) (Economides et al., 1994):

$$\frac{dp}{\rho} + u\,du + g\,dz + \frac{2 f_f u^2\,dL}{D} + dW_s = 0 \qquad (1)$$

in which $p$ is the pressure in the tube, $\rho$ is the density of the fluid, $u$ is the fluid velocity, $g$ is the gravitational acceleration, $f_f$ is the Fanning friction factor, $L$ and $D$ are respectively the length and the internal diameter of the tube, and $W_s$ is the shaft work done in the system. Since we will not consider any device that does shaft work in the flowlines, $W_s = 0$. Considering that the tube walls are thermally insulated, the oil temperature can be considered constant during its production, thus variations in viscosity and in density of the fluid are negligible. In this paper, we also consider an incompressible single-phase oil flow. Then, we can integrate the last equation, resulting in equation (2):

$$\Delta p = p_1 - p_2 = \rho g\,\Delta z + \frac{\rho}{2}\,\Delta (u^2) + \frac{2 f_f \rho u^2 L}{D} \qquad (2)$$

where $\Delta z$ is the height difference between the ends of the tube. The three main components of the pressure drop are on the right-hand side of the equation, being respectively the potential energy, the kinetic energy and the frictional pressure drop. The higher the pressure drop, the higher the energy required to produce fluids. The tube diameter, both for pipelines (tubes that connect the platform to the coast) and for flowlines (well-platform, well-manifold and platform-manifold connections), is considered constant. Given that the variations in height between the seabed and the sea surface are negligible for this study, we adopt a constant average value for it.
If the water column and the density of the fluid are constant, the potential energy is also constant. Besides, there are no variations in kinetic energy for an incompressible fluid flowing through a constant cross-sectional area. Thus, based on the mentioned considerations, analyzing kinetic and potential energy is unnecessary.
The frictional pressure drop is a function of flow rate and tubing length. Since the flow rate is the main aspect of the Monte Carlo simulation applied in this study, and the tubing length is directly related to the location of platforms and manifolds (which will be referred to in this paper as receivers), we address the field layout design problem by minimizing the sum of the friction losses in the flowlines and pipelines, as shown in equation (3):

$$\min \sum_{i \in P} \frac{32 f_{f,i}\,\rho\,q_i^2 L_i}{\pi^2 D_i^5} \qquad (3)$$

where $P$ is the set of all segments of pipe in the oil field, $q_i$ is the flow rate in pipe $i$, $L_i$ is the length of pipe $i$ and $D_i$ is the internal diameter of pipe $i$. Each term is the friction term of equation (2) with $u_i = 4 q_i / (\pi D_i^2)$. This approach agrees with Rosa (2006), which states that the platform location should be chosen considering the pressure loss in flowlines so that the reservoir energy is fully exploited. The Fanning friction factor $f_{f,i}$ of the flow in tube $i$ is a function of the Reynolds number ($Re_i$) and of the relative roughness of tube $i$ ($\epsilon_i$), being usually calculated using the Colebrook-White equation or its graphical form, the Moody chart. A non-iterative and accurate approach to calculate $f_f$ is given in equation (4) (Chen, 1979):

$$\frac{1}{\sqrt{f_{f,i}}} = -4 \log_{10}\left\{ \frac{\epsilon_i}{3.7065} - \frac{5.0452}{Re_i} \log_{10}\left[ \frac{\epsilon_i^{1.1098}}{2.8257} + \left( \frac{7.149}{Re_i} \right)^{0.8981} \right] \right\} \qquad (4)$$

For laminar flow, the friction factor is determined by equation (5) (Moody and Princeton, 1944):

$$f_{f,i} = \frac{16}{Re_i} \qquad (5)$$

The Reynolds number for the pipe is calculated through equation (6) (Economides et al., 1994):

$$Re_i = \frac{\rho\,u_i D_i}{\mu} = \frac{4 \rho\,q_i}{\pi D_i \mu} \qquad (6)$$

where $\mu$ is the fluid viscosity. As relative roughness, density, internal diameter and viscosity are considered constant, the objective function of the problem depends only on flow rate and pipe length.
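The friction-loss objective described above can be sketched in a few lines of Python. This is a minimal illustration in SI units, not the authors' C implementation: the function names, the laminar threshold of 2100 and the use of a single relative roughness for all pipes are assumptions made here.

```python
import math

LAMINAR_LIMIT = 2100.0  # assumed laminar/turbulent threshold, not stated in the text


def fanning_friction_factor(reynolds, rel_roughness):
    """Fanning friction factor: 16/Re if laminar, Chen (1979) otherwise."""
    if reynolds < LAMINAR_LIMIT:
        return 16.0 / reynolds
    inner = rel_roughness ** 1.1098 / 2.8257 + (7.149 / reynolds) ** 0.8981
    outer = rel_roughness / 3.7065 - (5.0452 / reynolds) * math.log10(inner)
    return (1.0 / (-4.0 * math.log10(outer))) ** 2


def friction_loss(q, length, diameter, density, viscosity, rel_roughness):
    """Frictional pressure drop (Pa) of one pipe carrying flow rate q (m^3/s)."""
    if q == 0.0:
        return 0.0
    velocity = 4.0 * q / (math.pi * diameter ** 2)
    reynolds = density * velocity * diameter / viscosity
    f = fanning_friction_factor(reynolds, rel_roughness)
    return 2.0 * f * density * velocity ** 2 * length / diameter


def objective(pipes, density, viscosity, rel_roughness):
    """Sum of friction losses over pipes given as (q, length, diameter) tuples."""
    return sum(
        friction_loss(q, length, d, density, viscosity, rel_roughness)
        for q, length, d in pipes
    )
```

For a fixed flow rate and diameter, the loss grows linearly with pipe length, which is why the receiver positions drive the objective value.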
Proposed approach

Monte Carlo simulation
The Monte Carlo simulation, developed by Metropolis and Ulam (1949), is a process that runs a model numerous times, randomly selecting each variable value according to its probability distribution curve in order to create many possible scenarios. Computers allow the model to run thousands of times in a feasible time. The analysis of the resultant scenarios may show the most probable case along with statistical data which allows the understanding of the uncertainty involved. The Monte Carlo simulation is an alternative to deterministic approaches.
Our objective is to generate many production scenarios of an oil field using Monte Carlo simulation. To this end, we must assign flow rates to the wells using a flow rate probability distribution curve. These many scenarios will then be compiled in an instance to be solved by the genetic algorithm.
In order to study well flow rates in a stochastic scenario, the exponential decline curve, presented in equation (7), which gives the well flow rate $q$ at a given time $t$, can be used in a Monte Carlo simulation in a similar manner as Gilman et al. (1998), treating the decline rate ($a$) and the initial flow rate ($q_0$) as random variables:

$$q(t) = q_0\,e^{-a t} \qquad (7)$$

Therefore, the exponential decline curve does not appear as a single curve, but as a probabilistic region. In order to avoid processing computationally expensive well models and reservoir simulations to obtain a probability distribution curve for the decline rate, we defined the well and reservoir coupling by a simple Productivity Index (PI) model. For a volumetric oil reservoir at pseudo-steady state flow, the decline rate can be defined as presented in equation (8), where $k$ is the rock permeability, $h$ is the reservoir net pay, $\mu$ is the fluid viscosity, $N_i$ is the initial oil in place in the well drainage radius ($r_e$), $r_w$ is the well radius, $c_t$ is the total compressibility of the reservoir, and $s$ is the well skin factor (Guo et al., 2007). It is important to note that this PI model is employed here for the sake of simplicity: it represents reality at a level of detail sufficient for the objectives of this paper. PI models that consider prior geological knowledge, such as compartmentalized and unconventional reservoirs, as presented in Shahamat et al. (2016), or PI models that analyze interwell connectivities by assessing the coupling of wells between themselves and the geological formation, as presented by Noetinger (2016), are recommended for a more detailed analysis.
Based on data from many points of an oil field, it is then possible to define probability distributions for each one of these properties, therefore enabling a Monte Carlo simulation. After carrying out the Monte Carlo simulation, flow rates for each well were drawn from the resulting flow rate probability distribution curve, creating what is called here a universe: one possible scenario of producing wells in the oil field.
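As an illustration of how universes might be generated, the sketch below samples the exponential decline curve directly for each well. It is a simplification: the lognormal distribution for the decline rate and its parameters are hypothetical placeholders (the paper derives the decline rate from fitted reservoir property distributions), and the rejection of universes above a maximum field output anticipates the rule used in the case studies.

```python
import math
import random


def sample_universe(n_wells, t_years, rng, q0_mean=700.0, q0_std=150.0,
                    a_mu=-2.0, a_sigma=0.5):
    """Draw one flow rate per well from q = q0 * exp(-a * t)."""
    rates = []
    for _ in range(n_wells):
        q0 = max(0.0, rng.gauss(q0_mean, q0_std))   # initial flow rate
        a = rng.lognormvariate(a_mu, a_sigma)       # hypothetical decline rate, 1/year
        rates.append(q0 * math.exp(-a * t_years))
    return rates


def sample_instance(n_universes, n_wells, t_years, max_output, seed=0):
    """Collect universes, regenerating any whose total exceeds max_output."""
    rng = random.Random(seed)
    instance = []
    while len(instance) < n_universes:
        universe = sample_universe(n_wells, t_years, rng)
        if sum(universe) < max_output:
            instance.append(universe)
    return instance
```

A call such as `sample_instance(10000, 22, 5.0, 160000.0)` would produce an instance with the dimensions of the first case study, under the placeholder distributions stated above.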

Greedy algorithm
An algorithm is classified as greedy if it always chooses, at each step of generating a solution, the best option available at the moment. For example, a greedy algorithm for the field layout problem will always allocate the receiver to the best well cluster available. More information about greedy algorithms can be found in Cormen et al. (2009). The main reason we compared our genetic algorithm with a greedy one is that there are no similar approaches to the field layout design problem in the literature that we could compare to our method. Thus, we relied on comparing to how a decision maker would solve this problem nowadays, considering uncertainties. The greedy algorithm is a close approximation of how humans make decisions over complex problems, such as the knapsack problem (Murawski and Bossaerts, 2016). Albeit simple by definition, the greedy algorithm is similar to the method a human would use when designing an oilfield layout with dozens of wells in such conditions, and is thus an interesting comparison.
L.P.A. Sales et al.: Oil & Gas Science and Technology - Rev. IFP Energies nouvelles 73, 24 (2018)
In order to evaluate and compare the solutions obtained by the genetic algorithm, we use a greedy algorithm as described in Figure 1. It can be divided into two phases: the location and allocation of (i) manifolds and (ii) platforms.
In the greedy algorithm, the manifolds are placed on the seabed in descending order of production capacity, one by one. To make the algorithm easier to understand, an analogy can be made between the flow rate of the wells and the mass of objects in a physical system. First, the "center of mass" $(x_{cm}, y_{cm})$ of the well layout is calculated as shown in equations (9) and (10):

$$x_{cm} = \frac{\sum_{i \in T} a_i q_i x_i}{\sum_{i \in T} a_i q_i} \qquad (9)$$

$$y_{cm} = \frac{\sum_{i \in T} a_i q_i y_i}{\sum_{i \in T} a_i q_i} \qquad (10)$$

where $T$ is the set of all wells in the oil field, $q_i$ is the flow rate of well $i$, $(x_i, y_i)$ are the Cartesian coordinates of well $i$, and $a_i \in \{0, 1\}$ is equal to 0 if well $i$ was already allocated, and 1 otherwise. The biggest manifold available (in terms of production capacity) is then placed at the center of mass determined, ending its location procedure. Next, the wells are allocated to the manifold in ascending order of distance to the manifold. When the manifold cannot connect to the next well due to capacity constraints, its allocation ends. These steps repeat for all manifolds available.
The procedures for the platform location and allocation are similar. The platforms are placed by descending order of production capacity, one by one. The main difference is in how the center of mass is determined: it now considers all non-allocated wells and manifolds in the oil field. The biggest production platform will be placed in the determined center of mass, and then the closest wells and manifolds will be assigned to it up to the point the platform cannot handle the additional production. The location and allocation of the platform are then complete. This procedure repeats for all available platforms, thus completing the greedy solution. Finally, its fitness is evaluated.
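A compact sketch of one phase of this greedy procedure is given below, assuming wells are `(x, y, q)` tuples with positive flow rates and receivers are described only by their capacities. The names `center_of_mass` and `greedy_allocate` are illustrative, and the platform phase (which also treats manifolds as sources) is omitted for brevity.

```python
import math


def center_of_mass(wells, free):
    """Flow-rate-weighted centroid of the wells that are still unallocated."""
    total = sum(q for (_, _, q), is_free in zip(wells, free) if is_free)
    cx = sum(q * x for (x, _, q), is_free in zip(wells, free) if is_free) / total
    cy = sum(q * y for (_, y, q), is_free in zip(wells, free) if is_free) / total
    return cx, cy


def greedy_allocate(wells, capacities):
    """Place receivers by descending capacity and fill each with nearest wells.

    Returns (positions, allocation): positions maps receiver index to (x, y),
    allocation[i] is the receiver index of well i, or None if unallocated.
    """
    free = [True] * len(wells)
    allocation = [None] * len(wells)
    positions = {}
    order = sorted(range(len(capacities)), key=lambda r: -capacities[r])
    for r in order:
        if not any(free):
            break
        cx, cy = center_of_mass(wells, free)
        positions[r] = (cx, cy)
        remaining = capacities[r]
        nearest_first = sorted(
            (i for i in range(len(wells)) if free[i]),
            key=lambda i: math.hypot(wells[i][0] - cx, wells[i][1] - cy),
        )
        for i in nearest_first:
            if wells[i][2] > remaining:
                break  # the next closest well does not fit: allocation ends
            allocation[i] = r
            free[i] = False
            remaining -= wells[i][2]
    return positions, allocation
```

Note that the loop stops at the first well that does not fit, even if a farther and smaller well would, which mirrors the stopping rule described in the text.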

Solution codification
The codification for both greedy and genetic algorithms is the same, having one vector (technically addressed as the chromosome in a genetic algorithm) for allocation and another for location. The allocation is represented by a vector $\tilde{a} = \{a_1, a_2, \ldots, a_w, b_1, b_2, \ldots, b_m\}$, where $w$ is the number of wells in the oilfield, and $m$ is the number of available manifolds. For example, the $i$-th well connects to receiver $a_i$, and the $j$-th manifold connects to the platform $b_j$.
For example, an instance with 11 wells, 2 manifolds and 2 platforms could be encoded according to Table 1. A chromosome for the allocation subproblem of this instance could be: {0, 1, 0, 3, 3, 2, 2, 2, 3, 1, 1, 1, 0}, which means that the first well is connected to platform 1, the second well is connected to the platform 2, the third well is connected to platform 1, the fourth well is connected to manifold 2 (represented by number 3) and subsequently up to the last two values, which represent the manifold connections. The manifold 1 is connected to platform 2, while manifold 2 is connected to platform 1. This example is illustrated in Figure 2. The well index (in red) refers to which receiver the well is allocated to, while the manifold index (in blue) refers to which platform the manifold is allocated to.
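The example chromosome can be decoded with a short helper. The numbering convention used here (platforms first, starting at 0, followed by manifolds) is inferred from the example in the text rather than taken from Table 1, and the function name is illustrative.

```python
def decode_allocation(chromosome, n_wells, n_platforms):
    """Translate an allocation chromosome into readable receiver names."""
    def receiver_name(code):
        if code < n_platforms:
            return f"platform {code + 1}"
        return f"manifold {code - n_platforms + 1}"

    wells = [receiver_name(gene) for gene in chromosome[:n_wells]]
    # Manifold genes only ever point to platforms.
    manifolds = [f"platform {gene + 1}" for gene in chromosome[n_wells:]]
    return wells, manifolds


example = [0, 1, 0, 3, 3, 2, 2, 2, 3, 1, 1, 1, 0]
wells, manifolds = decode_allocation(example, n_wells=11, n_platforms=2)
```

Here `wells[3]` is `"manifold 2"` and `manifolds` is `["platform 2", "platform 1"]`, matching the description above.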

Genetic algorithm
Genetic algorithms, initially proposed by Holland (1975), have been studied in depth over the years, especially in engineering after the work of Goldberg (1989). Today, genetic algorithms are very popular in several areas (Kadri and Boctor, 2018; Masmoudi et al., 2017), such as thermodynamics (Ahmadi et al., 2016, 2017) and petroleum engineering (Ghorbani et al., 2016; Sheremetov et al., 2018), as they combine deterministic and stochastic elements in order to generate high-quality solutions. In a genetic algorithm, a solution is represented by what is called an individual. The individuals have one or more chromosomes, which are binary strings (or some other codification) that represent a feasible solution. The element of a string is called a gene. Several individuals are generated, randomly or not, creating a set of individuals called a population. Each individual has its fitness evaluated relative to the fitness function adopted. Then, the algorithm randomly selects individuals and crosses them, meaning that the individuals' genome is recombined (and possibly mutated) into a new offspring, thus forming a new generation. The offspring may replace a chromosome of the population. The process of creating a new generation iterates until the decision-maker is satisfied with the quality of the fittest individual of the population. The genetic algorithm pseudocode applied here is presented in Figure 3. It can be divided into population generation, binary tournament, fusion crossover and chromosome replacement.

Population generation
In order to generate the initial population, we first tried a grid-based technique, calculating for each node the probability of receiving a platform or a manifold in proportion to the flow rate of the wells and manifolds within a given radius, and then applying the classical assignment method presented in Ghoseiri and Ghannadpour (2007). However, this technique proved to be very time-consuming in dense grids. Therefore, a gridless technique was employed, similar to the GRASP metaheuristic presented in Feo and Resende (1989, 1995). The chromosomes are built from a randomized version of the greedy algorithm employed here. The difference in the randomized version is that manifolds and platforms are randomly chosen, and that the wells in the manifold allocation and the wells and manifolds in the platform allocation are randomly allocated.

Binary tournament
In each of the F generations, a binary tournament is performed in each of two pools (or arenas): two solutions are chosen at random from the population and their fitness is compared. The best solution of each pool is selected for crossover.

Fusion crossover
The crossover method used here is the fusion crossover proposed by Beasley and Chu (1996), both for allocation and location chromosomes. For each gene in the chromosome, given that the fitness of parent 1 is $a_1$ and the fitness of parent 2 is $a_2$, the child has a probability $p_1 = a_1 / (a_1 + a_2)$ of receiving the gene of parent 1 and $p_2 = a_2 / (a_1 + a_2)$ of receiving the gene of parent 2. Figure 4 illustrates the binary tournament and the fusion crossover procedures. The solutions randomly selected in the population are highlighted in orange, while the solutions with the highest fitness in the binary tournament are colored in green. The blue color represents the parent 1 genes passed on to the offspring, and the red color, the parent 2 genes.
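The selection and crossover steps can be sketched as follows, assuming non-negative fitness values where higher is better and chromosomes stored as Python lists. The function names are illustrative and this is not the authors' C code.

```python
import random


def binary_tournament(fitness, rng):
    """Pick two distinct individuals at random and return the fitter index."""
    i, j = rng.sample(range(len(fitness)), 2)
    return i if fitness[i] >= fitness[j] else j


def fusion_crossover(parent1, parent2, fit1, fit2, rng):
    """Each gene comes from parent 1 with probability fit1 / (fit1 + fit2)."""
    p1 = fit1 / (fit1 + fit2)
    return [g1 if rng.random() < p1 else g2 for g1, g2 in zip(parent1, parent2)]
```

Running the tournament twice, once per pool, yields the two parents; genes on which both parents agree are always inherited unchanged.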

Mutation
There is a probability of mutation in the offspring for both allocation and location chromosomes, where a random gene is changed to a random value within the gene's feasible range. Figure 5 illustrates the mutation procedure.
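A minimal sketch of this mutation for an integer-coded allocation chromosome is shown below. The `gene_ranges` argument, a list of inclusive `(low, high)` bounds per gene, is a hypothetical way of expressing the feasible range; location chromosomes would need continuous values instead.

```python
import random


def mutate(chromosome, gene_ranges, probability, rng):
    """With the given probability, reset one random gene within its range."""
    child = list(chromosome)  # leave the input chromosome untouched
    if rng.random() < probability:
        k = rng.randrange(len(child))
        low, high = gene_ranges[k]
        child[k] = rng.randint(low, high)  # bounds are inclusive
    return child
```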

Chromosome replacement
The distances between receivers, wells and other receivers are calculated for the offspring. The fitness of the offspring is then evaluated, and if it has a greater fitness than the parent with the lowest fitness, the offspring replaces this parent.
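The replacement rule can be written as a small helper, assuming the population and its fitness values are kept in parallel lists and `parents` holds the indices of the two selected parents. The names are illustrative.

```python
def replace_if_better(population, fitness, parents, child, child_fitness):
    """Replace the less fit parent when the offspring has greater fitness."""
    worst = min(parents, key=lambda i: fitness[i])
    if child_fitness > fitness[worst]:
        population[worst] = child
        fitness[worst] = child_fitness
        return True
    return False
```

Because only the weaker parent can be displaced, the fittest individual of the population is never lost under this rule.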

Algorithm tuning and hardware
The algorithm parameters were tuned manually in order to obtain good solutions in a feasible computational time. The population size was set to 1 000, the number of generations was set to 100 000, and the mutation probability, both for location and allocation subproblems, was set to 2%. The greedy algorithm was initially implemented in Microsoft Excel using Visual Basic for Applications. However, the computational time required for a single universe was greater than 10 s, an extremely high value compared to solving the same universe with a greedy algorithm implemented in C, as seen further. Therefore, the greedy and genetic algorithms were implemented in C and executed on an Intel i5-5200U with 8 GiB RAM, using the Debian GNU/Linux 8.6 operating system. The instances used here are available at Sales et al. (2017b), while the source code of the algorithms is available at Sales et al. (2017a).
The general approach presented here is illustrated in Figure 6. Probability curves of fluid and reservoir properties were fed to a Monte Carlo simulation, generating a flow rate probability distribution curve. Then, flow rates for each well were drawn from this curve in order to generate a universe. Several universes were generated, creating an instance of the problem. The greedy and genetic algorithms solved this instance, and their raw results were converted to information through statistical analysis of the mean and standard deviation and through the use of heat maps. The information displayed in tables and heat maps was then subjected to statistical inference, from which conclusions were drawn.

Case study 1: Wilmington-Rosa
For the first case study presented, fluid and reservoir properties of the Wilmington field were taken from the National Energy Technology Laboratory (1984) database and used for generating probability distribution curves for many variables (fluid viscosity, rock porosity and permeability, reservoir area, net pay, oil formation volume factor, number of producing wells, and initial oil saturation), since the fluid and reservoir properties of the Rodrigues et al. (2016) instances were unavailable to the authors. The probability distribution curves of the variables were selected through the Akaike Information Criterion, verifying their p-values through the Kolmogorov-Smirnov test.
The Monte Carlo simulation was then performed, obtaining a probability curve for the flow rate of the wells, given a time t = 5 years and considering a normal probability distribution for the initial flow rate, with an average of 700 bbl/d (1.29 × 10⁻³ m³/s) and a standard deviation of 150 bbl/d (2.76 × 10⁻⁴ m³/s). Figure 7 presents the histogram of the flow rate q. In order to perform a relevant statistical analysis, we sampled 10 000 universes based on the illustrated histogram. The algorithm solves each one of these universes, recording the obtained solutions for the field layout design problem. The oil field has a maximum output of 160 000 bbl/d (2.94 × 10⁻¹ m³/s), so the sum of the flow rates of a universe must be below this value; otherwise the universe is generated again.
The coordinates of each well are given in Rosa (2006) and illustrated in Figure 8. There are 22 wells distributed in an oil field of 15 × 15 km with an average water depth of 1.3 kilometers. The terminal which receives the oil production is at coordinate (14, 0.3, 0.02), expressed in kilometers. This instance is composed of satellite wells.
Other case study parameters, such as the number of available receivers and their production capacities, the diameter and other tube properties, as well as fluid properties, are presented in Table 2.
Due to the large quantity of solutions obtained (10 000), they will not be fully reproduced here. Instead, a summary of the results will be presented. Table 3 shows the mean and the standard deviation of the Objective Function (OF) and of the Computational Time (CT) for the 10 000 universes, for each algorithm. The genetic algorithm presented a mean and a standard deviation much smaller than the greedy algorithm. The pthreads library (Buttlar et al., 1996) allowed the algorithms to process 4 universes simultaneously on the single processor, thus solving the whole problem in a feasible computational time (0.038 s required for the greedy algorithm and 25 min for the genetic algorithm). Using integer linear programming models, Rosa (2006) solved a similar instance with a single universe in 960 min and Rodrigues et al. (2016) in 5.3 min. We must note that the hardware, operating systems, solvers, etc. used in these studies are different, and also that each of these is a single benchmark. Therefore, it is not possible to directly compare the computational times.
Figures 9a and 9b plot, respectively, the OF relative gaps and the CT absolute gaps for each universe. By absolute gap, we mean the value obtained by the genetic algorithm minus the value obtained by the greedy algorithm. By relative gap, we mean the absolute gap divided by the value obtained by the greedy algorithm.
Examining Figure 9a, the genetic algorithm usually obtains solutions 30-40% better than the ones obtained by the greedy algorithm. Figure 9b shows that there are small and uncommon CT gaps in this instance.
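The two gap definitions translate directly into code. The sketch below uses made-up objective values purely for illustration; since the objective is minimized, a negative relative gap means the genetic algorithm found the better solution.

```python
def absolute_gaps(genetic, greedy):
    """Genetic value minus greedy value, universe by universe."""
    return [g - h for g, h in zip(genetic, greedy)]


def relative_gaps(genetic, greedy):
    """Absolute gap divided by the greedy value, universe by universe."""
    return [(g - h) / h for g, h in zip(genetic, greedy)]


# Hypothetical objective values for three universes (not results from the paper).
greedy_of = [100.0, 200.0, 50.0]
genetic_of = [65.0, 130.0, 35.0]
gaps = relative_gaps(genetic_of, greedy_of)
```

With these hypothetical values, every relative gap is about -0.3 to -0.35, which would be read as solutions 30-35% better than the greedy ones.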
In order to evaluate whether some allocation solutions are more representative than others, we determined their frequency, finding that 1238 distinct solutions were obtained across all universes. We also observed that 82.7% of the solutions indicate that the first 15 wells must be allocated to platform 2 or to some manifold. Then, the decision that must be made is whether the first 15 wells should be allocated to platform 2 or a manifold, and whether the remaining 7 wells should be allocated to platform 1 or a manifold.
Besides, we also evaluated the solutions for the location subproblem. As already mentioned, the algorithm records the position of the receivers in each universe. After solving all universes, it is possible to plot heat maps and evaluate the regions that receive platforms and manifolds with higher frequency. Figures 10a and 10b present the heat maps of platforms 1 and 2, respectively. The color intensity varies according to the scale of each heat map. The color scheme ranges from blue (lower density) through white to purple (higher density). The black circles are the well locations. Both platforms were employed in all universes.
We observe that the platforms have narrow ranges of positions, concentrating especially around the coordinates (12, 4) and (4, 5). Similar to the allocation subproblem, there is a small set of solutions that appear in most universes. Rodrigues et al. (2016) found a solution for one platform at the coordinate (4,5) and Rosa (2006) at coordinate (4,4). These coordinates, obtained through exact methods, were also indicated by the genetic algorithm, pointing to the robustness of the proposed approach.
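The data behind such a heat map can be obtained by counting recorded receiver positions per grid cell. The sketch below assumes a square field and a uniform grid; the cell count and the function name are choices made here for illustration, not details from the paper.

```python
def heat_map(positions, field_size, cells):
    """Count how many recorded (x, y) positions fall into each grid cell."""
    grid = [[0] * cells for _ in range(cells)]
    for x, y in positions:
        # Clamp so that points on the upper border fall in the last cell.
        col = min(int(x * cells / field_size), cells - 1)
        row = min(int(y * cells / field_size), cells - 1)
        grid[row][col] += 1
    return grid


# Hypothetical platform positions recorded over three universes, in km.
recorded = [(12.5, 4.5), (12.6, 4.4), (4.5, 5.5)]
grid = heat_map(recorded, field_size=15.0, cells=15)
```

Dividing each count by the number of universes gives the frequency with which a receiver was placed in that cell.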
The heat maps of manifolds 1 and 2 are respectively shown in Figures 11a and 11b. Rodrigues et al. (2016) found a solution to one manifold at the coordinate (4,5) and Rosa (2006) at coordinate (4,4), which are close to a hot region. In this case study, 71.62% of the universes employed at least one manifold, and 24.01% used both manifolds. The probabilistic method proposed here shows there are two regions where the manifold could be installed.

Case study 2: Jubarte
The second case study is based on the Brazilian Jubarte oil field, located 80 kilometers from the coast of Espírito Santo, below a water column of 1.3 kilometers. The instance used here is based on the data presented in Rodrigues et al. (2016). In order to study a situation that only the range of the flow rate of the wells is known, and to allow a comparison with Rodrigues et al. (2016) study, a uniform distribution between 3 000 and 20 000 bbl/d (5.52 Â 10 À3 m 3 /s and 3.68 Â 10 À2 m 3 /s) was used to assign flow rates to each well, for each universe. As in the first case study, there is a maximum field output of 250 000 bbl/d (4.60 Â 10 À1 m 3 /s) of oil.
The coordinates of each well considered here are present in Rodrigues et al. (2016) and illustrated in Figure 12. There are 27 wells distributed in an oil field of 11 Â 11 squared kilometers and an average water depth of 1.295 kilometers. The terminal which receives the oil production is at coordinate (77, 0.3, 0.02), expressed in kilometers. This instance is also composed of satellite wells.
Other case study parameters, as the number of available receivers and their production capacities, the diameter and other tube proprieties, as well as fluid proprieties, are presented in Table 4. Table 5 shows the mean and the standard deviation of the objective function and the computational time for the 10 000 universes, for each algorithm. The integer linear  programming model presented on Rodrigues et al. (2016) took 826.01 s to solve a single universe of this case study. Again, no direct comparison can be done, since hardware, operational system, methods, etc. are different, and also because this is a single benchmark.
In Figures 13a and 13b it is plotted, respectively, the OF relative gaps and the CT absolute gaps for each universe. Examining Figure 13a, the genetic algorithm usually obtains solutions 25% to 30% better than the ones obtained by the greedy algorithm. Compared to the Wilm-Rosa instance, the relative gap between the OF averages has decreased, while the total time required has increased for both algorithms, indicating a loss of performance by the genetic algorithm. Figure 13b shows that there are also small and uncommon CT gaps in this instance.
To evaluate whether some allocation solutions are more representative than others, we determined their frequency. The frequencies of the genetic algorithm's solutions are more evenly spread across the universes than in the first case study: the 20 most frequent solutions are found in only 5.89% of the universes.
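The frequency analysis above can be reproduced with a simple counter over the allocation found in each universe. The sketch below assumes each allocation is encoded as a tuple in which entry i is the receiver assigned to well i. This encoding, the function name and the toy data are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def top_k_coverage(solutions, k=20):
    """Fraction of universes whose solution is among the k most frequent ones."""
    counts = Counter(solutions)
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(solutions)


# Six toy universes with three wells each (illustrative data only).
allocations = [
    (0, 0, 1), (0, 0, 1), (0, 1, 1),
    (1, 0, 1), (0, 0, 1), (1, 1, 0),
]
print(top_k_coverage(allocations, k=1))  # 0.5
```

A coverage close to 1 for small k indicates that a few allocation patterns dominate. A low value, such as the 5.89% reported here for k = 20, indicates that solutions are nearly unique per universe.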
As for the Wilmington-Rosa instance, we also evaluated the solutions for the location subproblem. Figures 14a and 14b present the heat maps of platforms 1 and 2, respectively. Both platforms were employed in all universes.
The density seems to concentrate around the coordinates (3, 8) for platform 1 and (6, 3) for platform 2. As in the location subproblem of the first case study, a small set of coordinates appears in most solutions. Rodrigues et al. (2016) found a solution with one platform at coordinate (4, 5), which is close to the hottest point of platform 2.
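A heat map of this kind can be built by binning the platform position found in each universe. The sketch below is an illustration under stated assumptions: the 1 km cell size, the function names and the synthetic positions (a cloud scattered around (3.5, 8.5)) are hypothetical, and binning is used only for display, since the positions themselves are continuous.

```python
import numpy as np


def location_density(xy, extent_km=11.0, cell_km=1.0):
    """Share of universes whose platform falls in each grid cell."""
    xy = np.asarray(xy, dtype=float)
    n_bins = int(round(extent_km / cell_km))
    edges = np.linspace(0.0, extent_km, n_bins + 1)
    counts, _, _ = np.histogram2d(xy[:, 0], xy[:, 1], bins=[edges, edges])
    return counts / len(xy), edges


def hottest_cell_center(density, edges):
    """Centre coordinates (km) of the cell with the highest density."""
    ix, iy = np.unravel_index(np.argmax(density), density.shape)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[ix], centers[iy]


# Synthetic positions standing in for the per-universe optimizer output.
rng = np.random.default_rng(0)
positions = np.clip(rng.normal([3.5, 8.5], 0.3, size=(10_000, 2)), 0.0, 11.0)
density, edges = location_density(positions)
cx, cy = hottest_cell_center(density, edges)
print(cx, cy)
```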
The heat maps of manifolds 1 and 2 are shown in Figures 15a and 15b, respectively. Rodrigues et al. (2016) found a solution with one manifold at coordinate (4, 5), which is close to the region of highest density for both manifolds. The manifolds had a broader probabilistic region for positioning; at least one was employed in 73.10% of the universes and both in 27.50%. As Rodrigues et al. (2016) state that their results were consistent with reality, we can infer that ours are as well.

Case study 3: Stafjord
In order to evaluate the performance of the algorithm with a different well layout, the third case study is based on the Norwegian-British Stafjord oil field, located 180 kilometers off Norway's coast, below a water column of 157 meters. As fluid and reservoir properties were not available to the authors, the instance used the Wilmington probability curves of the first case study, except for the parameters of the normal probability distribution of the initial flow rate, for which a mean of 28 000 bbl/d (5.15 × 10⁻² m³/s) and a standard deviation of 2 000 bbl/d (3.68 × 10⁻³ m³/s) were assumed. The main objective of this case study is to evaluate how the algorithms perform on an oil field composed of clustered wells.
The coordinates of each well considered here are given in the Norwegian Petroleum Directorate (2016) database and illustrated in Figure 16. There are 69 wells distributed over an oil field of 24 × 2 kilometers, with an average water depth of 157 meters. The terminal which receives the oil production is at coordinate (24.92, 2.41, 0.013), expressed in kilometers. There are three well clusters in this instance.
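As a check on the SI values quoted for the initial flow rate distribution, the conversion can be written out using the standard factors 1 bbl ≈ 0.158987 m³ and 1 d = 86 400 s. These factors are not stated in the text, and the symbols for the mean and standard deviation of the initial flow rate are notation introduced here for illustration.

```latex
\begin{align*}
\bar{q}_0 &= 28\,000\ \text{bbl/d} \times \frac{0.158987\ \text{m}^3/\text{bbl}}{86\,400\ \text{s/d}} \approx 5.15 \times 10^{-2}\ \text{m}^3/\text{s},\\
\sigma_{q_0} &= 2\,000\ \text{bbl/d} \times \frac{0.158987\ \text{m}^3/\text{bbl}}{86\,400\ \text{s/d}} \approx 3.68 \times 10^{-3}\ \text{m}^3/\text{s}.
\end{align*}
```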
Other case study parameters, such as the number of available receivers and their production capacities, the diameter and other tube properties, as well as fluid properties, are presented in Table 6.
L.P.A. Sales et al.: Oil & Gas Science and Technology -Rev. IFP Energies nouvelles 73, 24 (2018)
Table 7 shows the mean and the standard deviation of the objective function and of the computational time for the 10 000 universes, for each algorithm. The genetic algorithm took 3 h and 20 min to solve all 10 000 universes. Although slower, it obtains better solutions than the simpler greedy algorithm.
Figures 17a and 17b plot, respectively, the OF relative gaps and the CT absolute gaps for each universe. The results show that the genetic algorithm obtains solutions 44-49% better than the greedy algorithm, which indicates that the greedy algorithm has difficulty obtaining good solutions in a clustered-wells scenario. The genetic algorithm required considerably more computational time than in the previous two case studies. This is due to the increased number of receivers and wells. However, for strategic-level problems such as the field layout problem, this computational time is negligible relative to the planning horizon and is therefore not a concern.
Figure 18 presents the heat maps for the three platforms of the instance, all of which were employed in all universes. Since fewer than 1% of the universes employed manifolds, we considered that they were not used.
We observe that the platforms have narrow ranges of positions, concentrating especially around the coordinate (23.56, 0.75). This is the optimal location found for all three platforms, since it lowers the sum of the pressure losses in the tubes. The use of one bigger platform instead of three platforms should be considered in order to lower investment costs. The real solution employed in the field was to place a platform above each well cluster. We believe that this difference is mostly due to the huge production capacity of the platforms considered here, along with economic and operational issues which were not studied in this paper, such as anticipated production, construction availability, environmental regulations, logistics, subsea geology and technology.
As in the first two case studies, representative solutions for the allocation were evaluated. In this case study, each universe had a unique solution; therefore, the allocation frequency is not reported.

Conclusions
The genetic algorithm presented here considered the probabilistic nature of the field layout design problem through a Monte Carlo simulation and, in contrast to the existing literature, it obtains relevant results and optimized solutions for many possible production scenarios. The provided set of instances will allow comparison with future work. Although the greedy algorithm is much quicker, the time required by the genetic algorithm is feasible for its application in the development of oil fields, since a few CPU hours are not relevant for strategic planning problems such as the field layout design problem. In contrast to the existing literature, a grid-less approach has proven to be more suitable in terms of performance.
Although exact methods, such as mixed-integer programming models, are guaranteed to find the optimum solution of a given problem, metaheuristic methods, such as the proposed genetic algorithm, return high-quality solutions, albeit without a proof of optimality. Nevertheless, the time required by an exact method to find the optimum solution to a difficult problem is much greater than that required by a metaheuristic (Martí and Reinelt, 2011), as exact methods usually employ exhaustive search. Dogru (1987), Grimmett and Startzman (1988), and Hansen et al. (1992) report many numerical difficulties in their exact methods as instance size increases, and we also noted such difficulties in Rosa (2006) and Rodrigues et al. (2016). If we applied these exact methods to real cases with several dozen platforms and hundreds of wells using Monte Carlo simulation, it would take months or even years to obtain results for a single run. Even though larger instances would take days to solve, our method would still be computationally feasible. Besides, the computational effort would be small compared to the benefits of such a robust analysis of uncertainty, as it enhances decision making.
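A rough order-of-magnitude illustration of this argument can be made with figures reported in the case studies. It extrapolates the 826.01 s per universe of the integer linear programming model of Rodrigues et al. (2016) on the Jubarte instance to 10 000 universes, assuming (hypothetically) a constant solve time per universe and ignoring the differences in hardware and software. It then compares this with the 3 h 20 min the genetic algorithm needed for the 10 000 universes of the larger third case study. The symbols are introduced here for illustration only.

```latex
\begin{align*}
t_{\text{exact}} &\approx 826.01\ \text{s} \times 10\,000 = 8.2601 \times 10^{6}\ \text{s} \approx 95.6\ \text{days},\\
t_{\text{GA}} &= 3\ \text{h}\ 20\ \text{min} = 12\,000\ \text{s}
\quad\Rightarrow\quad \frac{12\,000\ \text{s}}{10\,000\ \text{universes}} = 1.2\ \text{s per universe}.
\end{align*}
```

Since the two figures come from different instances and machines, this is only an indication of scale, not a benchmark.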
We found that a small set of location patterns is responsible for the majority of optimum solutions. Similar conclusions are seen in cutting stock problems, where a small set of cutting patterns is responsible for the majority of the obtained solutions (Araujo et al., 2014). This phenomenon is commonly described as a case of Pareto's law (Defeo and Juran, 2010).
By analyzing the results, it is possible to state where the wells should be allocated and how many platforms and manifolds are needed, as noted in the three case studies. Some results are similar to what was already found in the literature, such as the trend to install manifolds below platforms (Rosa, 2006; Rodrigues et al., 2016). However, the stochastic approach goes further, quantifying the uncertainty of the optimum location of platforms and manifolds.
Examining Figures 9b, 13b and 17b, we note that some computational time gaps are remarkably large. In these universes, the genetic algorithm struggles to generate feasible solutions because of their particular characteristics, for example in the allocation or crossover procedures.
Despite these large variations, such gaps occur rarely, and therefore their influence on the total computational time is negligible.
It is important to note that the results presented here should be taken as support for decision making, since other factors not studied here influence the final development project and therefore the field layout design problem, such as economic analyses, production facilities, drilling, and environmental criteria (Morooka and Galeano, 1999).
Compared to relying on traditional knowledge, the genetic algorithm applied here avoids the negative effects of human factors and provides a more scientific, quantitative and probabilistic means of addressing the field layout design problem. Compared to traditional deterministic methods, the Monte Carlo simulation employed here considers many feasible production scenarios, thus providing as much information as possible about the possible outcomes, more knowledge about the decision problem, and better support for decision making. Combined, the uncertainty estimation of Monte Carlo simulation and the high-performance optimization of genetic algorithms provided a robust method for solving the field design problem under uncertainty. The coupling combines the strengths of both methods, and thus they are equally important to attain optimized solutions.
The following ideas are suggested as future enhancements for this work: (i) studying the possible trend of the allocation solutions becoming increasingly unique as the instance size grows, (ii) adding the subproblem of optimizing the diameter of the pipes, (iii) adding a broader model for the coupling between the wells and the reservoir, (iv) adding a multiphase flow model to consider water and gas phases in pressure loss calculations, (v) estimating well flow rates according to the well type (horizontal, vertical or directional) and completion, and (vi) adopting a multi-objective approach to the problem.

Nomenclature
p    pressure inside the tube
ρ    density of the fluid in the tube
u    average fluid velocity in the tube
W_s  shaft work realized in the system
g    gravitational constant
f_f  Fanning friction factor
L    length of the tube
D    internal diameter of the tube
μ    fluid viscosity in the tube
k    rock permeability
h    reservoir net pay
Re   Reynolds number
e    relative roughness
a    decline rate
q_0  initial flow rate
r_w  well radius
r_e  well drainage radius
N_i  initial oil in place in the well drainage radius
c_t  total compressibility of the reservoir
s    well skin factor
w    number of wells in the oilfield
m    number of available manifolds
r    number of receivers