GPU-accelerated hydraulic simulations of large-scale natural gas pipeline networks based on a two-level parallel process

The numerical simulation efficiency of large-scale natural gas pipeline networks is usually unsatisfactory. In this paper, Graphics Processing Unit (GPU)-accelerated hydraulic simulations for large-scale natural gas pipeline networks are presented. First, based on the Decoupled Implicit Method for Efficient Network Simulation (DIMENS) presented in our previous study, a novel two-level parallel simulation process and the corresponding parallel numerical method for hydraulic simulations of natural gas pipeline networks are proposed. Then, the implementation of the two-level parallel simulation on the GPU is introduced in detail. Finally, numerical experiments are provided to test the performance of the proposed method. The results show that the proposed method achieves a notable speedup: for five large-scale pipe networks, compared with the well-known commercial simulation software SPS, the speedup ratio of the proposed method is up to 57.57 with comparable calculation accuracy. Even more encouraging, the proposed method adapts well to large pipeline networks: the larger the pipeline network, the larger the speedup ratio. The speedup ratio of the GPU method depends approximately linearly on the total number of discrete points in the network.


Introduction
With the increasing demand for clean energy worldwide, natural gas plays an increasingly important role in the energy structure [1,2]. Pipeline networks are the most common means of natural gas transportation, and they have been constructed on a large scale in recent years [3]. Natural gas pipeline networks are becoming increasingly large, and their topological structures increasingly complex [4]. Predicting the flow parameters in a natural gas pipeline network is important to managers who create operating schedules and forecast potential risks [5,6]. Hydraulic simulation is the most common way to analyze the flow in a gas pipeline network [7,8]. As early as the 1950s, hydraulic simulation of gas pipelines began with simplified mathematical models [9,10], in which some terms of the governing equations of pipeline flow, such as the inertia term, the transient term or the friction term, are ignored. In later years, numerical methods were proposed to solve the non-simplified mathematical model, such as the method of characteristics [11], the method of lines [12], and the finite difference method [13]. Among these methods, the implicit finite difference method is one of the most commonly used because of its high stability and accuracy.
However, because all the equations must be solved simultaneously at each time step, the implicit finite difference method has a large computational cost. To address this issue, studies have been conducted to accelerate the computation. Luongo [14] proposed a novel linearization method for the partial differential equations in a gas pipeline, which saves 25% of the computational time. Behbahani-Nejad and Shekari [15] presented a reduced-order modeling approach based on the implicit finite model for natural gas transient flow in pipelines, showing a 71.14% time saving compared with direct simulation of the original nonlinear model. To further improve the efficiency of the implicit finite difference method, an adaptive implicit finite difference method for natural gas pipeline transient flow was introduced [16]. The results show that the calculation speed of slow transient simulations is accelerated while the computational accuracy of fast transient simulations remains high. Moreover, the Decoupled Implicit Method for Efficient Network Simulation (DIMENS) was proposed in our previous study [17,18]. DIMENS is a rapid method for the hydraulic simulation of natural gas pipeline networks, in which the pipelines are decoupled and solved independently. Compared with the well-known commercial simulation software SPS, its speedup ratio is 2.5. Although these methods improve the efficiency of hydraulic simulation of gas pipeline networks, the efficiency is still insufficient for large-scale networks.
In recent years, Graphics Processing Unit (GPU) acceleration has attracted considerable research interest in scientific computing. Studies in various fields, e.g., fluid simulation [19][20][21], weather prediction [22,23], seismic damage simulation [24], and sheet forming simulation [25], show great performance with GPU computing. GPU computing is the use of a GPU as a coprocessor to accelerate the CPU in general-purpose scientific and engineering computing. Compared with a CPU on a single PC, a GPU uses thousands of smaller and more efficient cores in a massively parallel architecture aimed at handling many tasks simultaneously. It provides superior processing power, memory bandwidth and efficiency over its CPU counterparts.
Obviously, if GPU technology were introduced into the simulation of natural gas pipeline networks, the computational efficiency would improve substantially. However, to the best of the authors' knowledge, there is little literature on GPU-accelerated simulations of natural gas pipeline networks, owing to the lack of a suitable parallel algorithm. Fortunately, the DIMENS method from our previous work [17,18] is a highly parallel algorithm. Its core idea is that the pipelines are first decoupled and then solved independently, so the solution process is very well suited to GPU computing. In other words, by combining the DIMENS method with GPU technology, the simulation of natural gas pipeline networks can be greatly accelerated. Therefore, to further improve the simulation efficiency of natural gas pipeline networks, our DIMENS method is extended to a two-level parallel process and implemented on the GPU. A novel two-level parallel simulation process and the corresponding parallel numerical method for hydraulic simulation of natural gas pipeline networks are proposed. The proposed method can effectively improve the simulation efficiency of large-scale natural gas pipeline networks and can be used to guide decisions regarding the design and operation of pipeline network systems.
The outline of this paper is as follows. In Section 2, the mathematical model and the DIMENS method are briefly introduced and analyzed. In Section 3, the idea of the two-level parallel process and the corresponding parallel numerical method for the hydraulic simulation of large-scale networks are proposed. In Section 4, the implementation of the two-level parallel process on GPUs is presented in detail. In Section 5, numerical experiments are presented to verify the accuracy and efficiency of the proposed GPU-accelerated method. The conclusions are summarized in Section 6.

Mathematical model and the DIMENS method
In this section, the mathematical model of a gas pipeline network and the DIMENS method are reviewed. For brevity and clarity, a network that includes only pipelines, supplies and demands is studied. The mathematical model of complex networks, which also include compressors and valves, can be found in our previous studies [17,18].

Pipeline model and discretization
According to our previous study [17,18], the hydraulic governing equations of a gas pipeline, comprising the continuity equation and the momentum equation, can be written as a general formula, equation (1). The hydraulic equation is a nonlinear partial differential equation, and its direct solution is usually unstable and inefficient [7,26]. Therefore, equation (1) is linearized into equation (2) by Taylor expansion [27]; the elements (i, j) of the coefficient matrices G and S are given in [17,18]. The implicit finite difference method is adopted to discretize equation (2) on a uniform pipe mesh, as shown in Figure 1. The space grid and time grid are divided horizontally and vertically, with Δx and Δt denoting the space step and the time step, respectively.
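The original equations (1)-(2) did not survive extraction. As a hedged sketch of their structure only (the symbols here are generic; the exact coefficient matrices G, S and right-hand side w are defined in [17,18]), the governing system, its linearization and a representative implicit discretization take a form such as:

```latex
% Generic conservative form of equation (1):
% u collects the hydraulic variables (pressure and mass flowrate)
\frac{\partial \mathbf{u}}{\partial t}
  + \frac{\partial \mathbf{f}(\mathbf{u})}{\partial x} = \mathbf{s}(\mathbf{u})

% Linearized form of equation (2), obtained by Taylor expansion
% about the previous time level
\mathbf{G}\,\frac{\partial \mathbf{u}}{\partial t}
  + \mathbf{S}\,\frac{\partial \mathbf{u}}{\partial x} = \mathbf{w}

% A representative implicit finite difference discretization on the
% uniform mesh of Figure 1 (the paper's exact stencil may differ)
\mathbf{G}\,\frac{\mathbf{u}_i^{n+1}-\mathbf{u}_i^{n}}{\Delta t}
  + \mathbf{S}\,\frac{\mathbf{u}_{i+1}^{n+1}-\mathbf{u}_{i-1}^{n+1}}{2\,\Delta x}
  = \mathbf{w}
```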

Boundary conditions
The supplies and demands are referred to as gas sources. The mathematical model of a gas source is given in equations (8)-(9); these are also called the external boundary conditions of the pipeline network. At the junction nodes of the components, the laws of mass conservation and pressure balance must be satisfied. The mathematical model of a junction node is given in equations (10)-(11), which are also called the internal boundary conditions. Equations (7)-(11) are the hydraulic discretized equations of a natural gas pipeline network. An Equation Of State (EOS), which expresses the density in terms of pressure and temperature, is needed to close this system of equations. Several EOSs, such as Peng-Robinson (PR) and Soave-Redlich-Kwong (SRK), are commonly used in the natural gas industry [30]. Solving this system of hydraulic discretized equations yields the pressure and mass flowrate in the pipes.
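As a sketch of the internal boundary conditions (10)-(11) described above (the notation here, including the sign convention and the node set P(v), is an assumption for illustration, not the paper's exact formulation):

```latex
% Mass conservation at junction node v: the signed mass flowrates of
% all pipe ends meeting at v balance the external supply/demand q_v
% (the sign of m_j is set by the pipe's orientation at the node)
\sum_{j \in \mathcal{P}(v)} \pm\, m_j = q_v

% Pressure balance: every pipe end meeting at node v shares the
% single nodal pressure p_v
p_j = p_v \quad \text{for all } j \in \mathcal{P}(v)
```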

DIMENS method
The DIMENS method is proposed based on the idea of divide-and-conquer in our previous study [17,18]. In the DIMENS method, the gas pipeline network is divided into several pipelines and a set of junction nodes, then the equations of the pipeline and those of junction nodes are solved independently. In this way, the system of origin discretized equations is split into two parts: the system of discretized equations for the pipe in equation (7) and the system of boundary equations in equations (8)- (11). In this paper, we call them PDES and BES, respectively. There are as many PDESs in the system of discretized equations as there are pipelines in a network.
To describe the decoupling of the DIMENS method, a simple pipeline network of 9 nodes and 12 pipes is used as an example, as shown in Figure 2, where red circles represent nodes and black lines represent pipes. The blue points indicate the spatial discretized points of the pipes. Using the DIMENS method, the simple network is divided into 9 nodes and 12 pipes; that is, one original system of equations of size 9 + 12N is divided into one BES of size 9 and twelve block PDESs, each of size N.
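The size bookkeeping above can be sketched numerically. This is an illustrative toy only; the per-pipe point count N below is a hypothetical value, not from the paper:

```python
# Toy bookkeeping for the example network in Figure 2: 9 junction
# nodes and 12 pipes, each pipe discretized into N points.
def split_system_sizes(n_nodes, n_pipes, points_per_pipe):
    """Return (BES size, list of PDES sizes) after DIMENS decoupling."""
    bes_size = n_nodes
    pdes_sizes = [points_per_pipe] * n_pipes
    return bes_size, pdes_sizes

N = 100  # assumed number of discrete points per pipe (illustrative)
bes, pdes = split_system_sizes(9, 12, N)
total_coupled = 9 + 12 * N            # size of the original coupled system
total_decoupled = bes + sum(pdes)     # same unknowns, now in 13 small systems
print(bes, len(pdes), total_coupled == total_decoupled)
```

The point of the split is not to reduce the number of unknowns (it is unchanged) but to replace one large coupled solve with many small independent ones.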
The solution process of the PDESs and the BES using the DIMENS method is shown in Figure 3. More details about the DIMENS method can be found in the literature [17,18]. In the DIMENS method, the gas pipeline network is divided into several pipelines and a set of junction nodes, which can be simulated separately. Figure 3 shows that the solution processes of the PDESs in both step 1 and step 3 are independent. These analyses show that the DIMENS method is a highly parallel simulation method. However, the DIMENS method in our previous study [17,18] runs as a single process, so its advantages in simulation efficiency have not been fully exploited. Parallelizing the DIMENS method is an effective way to further improve the simulation efficiency for pipeline networks.

Two-level parallel simulation process and numerical method
In order to implement our DIMENS method on the GPU, the DIMENS method is extended to a novel two-level parallel simulation process and the corresponding parallel numerical method is introduced in detail in the following sections.

Two-level parallel simulation process
Because GPU cores are lightweight, the parallel granularity should be fine enough to maximize GPU utilization. Therefore, the two-level parallel simulation process is designed as coarse and fine level processes. In the coarse level process, the task of network simulation is first divided into several solution tasks for the PDESs and the BES. In the fine level process, the solution of each PDES or the BES is further divided into parallel arithmetic operations so that each subtask stays small.

Coarse level parallel process
In DIMENS, the solution processes of the PDESs are independent; therefore, each PDES can be treated as a subtask. Because the junction nodes and gas sources are independent of the pipes, the BES can similarly be treated as another subtask. That is, in the coarse level parallel process, the task of network simulation is divided into several solution tasks for the PDESs and the BES, and the solution of one PDES or the BES is considered a subtask. Figure 4 shows a sketch of the coarse level division of a simple network: the network is divided into several pipelines and a set of junction nodes, and the task of network simulation is divided into twelve solution tasks for the PDESs and one solution task for the BES.

Fine level process
There are usually many discretized points in one pipeline and many junction nodes in a network; therefore, the solving task for PDES or BES is heavy even when using GPU, which could result in low computational speed. To solve this problem, fine level patterns for division of the above tasks are proposed. In the fine level process, the solution task for the PDES or the BES is divided into parallel arithmetic operations, and each arithmetic operation is a subtask.
1. Solution task for the PDES.
In our previous study [17,18], the PDES, expressed as equation (7), is transformed into the block tridiagonal equations in equation (12), where A_i, B_i, C_i and D_i are the equation coefficients, all of which are 2 × 2 matrices; X_i are the fundamental and particular solutions of equation (7); and the subscript i denotes the ith discretized point. The PDES solution is the main computational load of DIMENS; therefore, parallelizing the solution process of equation (12) would greatly improve the computation speed. In our previous study [17,18], equation (12) was solved by the block TDMA method. TDMA is a Gaussian-elimination-based method, which is usually difficult to parallelize on a GPU [31].
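Written out, a block tridiagonal system of the kind in equation (12) has the standard form below (the boundary convention A_1 = C_{N+1} = 0 is assumed here for illustration):

```latex
\mathbf{A}_i \mathbf{X}_{i-1} + \mathbf{B}_i \mathbf{X}_i
  + \mathbf{C}_i \mathbf{X}_{i+1} = \mathbf{D}_i,
\qquad i = 1, 2, \dots, N+1,
\qquad \mathbf{A}_1 = \mathbf{C}_{N+1} = \mathbf{0}
```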
In this study, the Parallel Cyclic Reduction (PCR) method [32] is adopted to parallelize the PDES solution. The PCR method was initially designed to solve scalar tridiagonal equation systems by Hockney and Jesshope in 1981. In this study, it is extended to solve the vectorial equation system in equation (12). The subtask of solving the PDES is the reduction and solution step of PCR. First, the basic parallel process of PCR is briefly introduced, then the parallel solution process of the PDES solution is described in detail in Section 3.2.2.

Solution task for BES.
The solution of the BES, which also needs parallelization with fine granularity, is the other important part of the computational load in DIMENS. The hydraulic variables of the start and end points of all pipelines are arranged by pipeline number; the BES can then be written out, where the subscript (i, j) denotes the ith point of the jth pipeline. Because of the sparsity and irregularity of the BES, a modified Conjugate Gradient (CG) algorithm was adopted to solve it with high efficiency in our previous study [17,18]. Because the CG method consists largely of matrix-matrix and matrix-vector multiplications, it is readily parallelized by assigning a few arithmetic operations to each computing core. The subtasks in solving the BES are these arithmetic operations.
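To make the structure of those arithmetic operations concrete, here is a textbook CG sketch (not the authors' modified CG variant): every step is built from the matrix-vector and vector-vector products that parallelize well, which is exactly why each product can be assigned to GPU cores.

```python
# Textbook conjugate gradient for a small symmetric positive-definite
# system A x = b. Every line is a matvec, dot product or axpy update --
# the operation types that map one-per-core onto a GPU.
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cg(A, b, tol=1e-12, max_iter=100):
    x = [0.0] * len(b)
    r = b[:]                      # residual b - A x, with x = 0 initially
    p = r[:]
    rs = dot(r, r)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]      # small SPD example, for illustration
b = [1.0, 2.0]
x = cg(A, b)
```

In the paper the same products are instead dispatched through cuSPARSE, but the decomposition into elementwise subtasks is the same idea.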

Basic process of PCR
In PCR, each reduction transforms one system of equations into two independent reduced systems, each with half the unknowns. The reduction procedure stops only when each system has at most 2 unknowns; the systems of 2 unknowns are readily solved. Figure 5 shows the reduction pattern of the PCR method for a simple case of 8 equations. The symbols e′ and e″ represent the updated equations, and the equations in each yellow rectangle form one system. The parallel process is introduced briefly in the following.
Step 1: The 1st reduction is performed by separating the even-indexed and odd-indexed unknowns, reducing the one system of 8-unknown equations to two systems of 4-unknown equations.
Step 2: The 2nd reduction is performed by the same operation, reducing the two systems of 4-unknown equations to four systems of 2-unknown equations.
Step 3: The four systems of 2-unknown equations are readily solved. Thus the PCR method comprises reduction and solution processes.
The system of 8-unknown equations is thus solved in 3 steps using the PCR method, and the work within each step is parallel. Therefore, when the PCR method is extended to solve the PDES, i.e., equation (12), it can be solved by N + 1 logical computing cores in only N_st steps, where N_st = log2(N + 1) (rounded up for general N). In this way, the computing task for the PDES is lightweight and can be executed efficiently on the GPU.
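The three steps above can be sketched for a scalar tridiagonal system (the paper's 2 × 2 block version replaces the scalar divisions with small matrix inverses). One note on the variant: this sketch runs the full ceil(log2(n)) sweeps so that each equation ends with a single unknown, whereas the paper stops one sweep earlier and solves the 2-unknown systems directly; the results are algebraically equivalent.

```python
import math

def pcr_solve(a, b, c, d):
    """Solve a tridiagonal system by Parallel Cyclic Reduction.
    a[i], b[i], c[i] are the sub-, main- and super-diagonal
    coefficients of row i, with a[0] = c[-1] = 0."""
    n = len(b)
    a, b, c, d = a[:], b[:], c[:], d[:]
    stride = 1
    for _ in range(math.ceil(math.log2(n))):
        na, nb, nc, nd = a[:], b[:], c[:], d[:]
        for i in range(n):  # on a GPU, one thread per equation i
            lo, hi = i - stride, i + stride
            alpha = -a[i] / b[lo] if lo >= 0 else 0.0
            gamma = -c[i] / b[hi] if hi < n else 0.0
            na[i] = alpha * a[lo] if lo >= 0 else 0.0
            nc[i] = gamma * c[hi] if hi < n else 0.0
            nb[i] = b[i] + (alpha * c[lo] if lo >= 0 else 0.0) \
                         + (gamma * a[hi] if hi < n else 0.0)
            nd[i] = d[i] + (alpha * d[lo] if lo >= 0 else 0.0) \
                         + (gamma * d[hi] if hi < n else 0.0)
        a, b, c, d = na, nb, nc, nd
        stride *= 2
    return [di / bi for di, bi in zip(d, b)]

# 8-unknown test system: -x_{i-1} + 2 x_i - x_{i+1} = d_i
n = 8
x_true = [float(i + 1) for i in range(n)]
a = [0.0] + [-1.0] * (n - 1)
c = [-1.0] * (n - 1) + [0.0]
b = [2.0] * n
d = [(a[i] * x_true[i - 1] if i > 0 else 0.0) + b[i] * x_true[i]
     + (c[i] * x_true[i + 1] if i < n - 1 else 0.0) for i in range(n)]
x = pcr_solve(a, b, c, d)
```

For n = 8 the loop performs exactly 3 sweeps, matching Steps 1-3 of Figure 5.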

PCR method for PDES solution
According to the basic parallel process of PCR, the parallel solution process of the PDES solution can be divided into two steps: the reduction step and the solution step. These two steps are described in detail in the following.

Reduction step:
For the one system in equation (12), all even-indexed and odd-indexed equation coefficients with index i are updated in parallel as a linear combination of the coefficients of equations i, i + 1 and i − 1, giving the kth reduced system shown in equation (13). The number of equations in each system of the form (13) is (N + 1)/2^k, and the number of such systems is 2^k. For the internal equations of each system (13), i.e., 2^k < i < N + 1 − 2^k, k = 1, 2, ..., N_st − 1, the equation coefficients are rewritten as linear combinations of the coefficients of the neighboring equations; for the first equation of each system, i.e., i = 2^k, and for the last equation, i.e., i = N + 1 − 2^k, the same updates apply with the out-of-range neighbor terms dropped. The superscripts k − 1 and k denote the (k − 1)th and the kth reduced systems, respectively.
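The detailed coefficient update formulas did not survive extraction. For reference, the standard PCR update with stride δ (a textbook form, not necessarily the paper's exact equations; for the first and last equations of each system the out-of-range terms are dropped, and because the blocks are 2 × 2 matrices the order of the products matters) reads:

```latex
\boldsymbol{\alpha}_i = -\mathbf{A}_i^{k-1}\bigl(\mathbf{B}_{i-\delta}^{k-1}\bigr)^{-1},
\qquad
\boldsymbol{\gamma}_i = -\mathbf{C}_i^{k-1}\bigl(\mathbf{B}_{i+\delta}^{k-1}\bigr)^{-1}

\mathbf{A}_i^{k} = \boldsymbol{\alpha}_i \mathbf{A}_{i-\delta}^{k-1},
\qquad
\mathbf{C}_i^{k} = \boldsymbol{\gamma}_i \mathbf{C}_{i+\delta}^{k-1}

\mathbf{B}_i^{k} = \mathbf{B}_i^{k-1}
  + \boldsymbol{\alpha}_i \mathbf{C}_{i-\delta}^{k-1}
  + \boldsymbol{\gamma}_i \mathbf{A}_{i+\delta}^{k-1}

\mathbf{D}_i^{k} = \mathbf{D}_i^{k-1}
  + \boldsymbol{\alpha}_i \mathbf{D}_{i-\delta}^{k-1}
  + \boldsymbol{\gamma}_i \mathbf{D}_{i+\delta}^{k-1}
```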

Solution step:
When k = N_st − 1, the number of equations in each system (13) is 2. The two equations of the (N_st − 1)th reduced system couple X_i and X_{i+2^k}; they can be readily solved, and X_i and X_{i+2^k} are obtained as in equations (16)-(17), where k = N_st − 1 and i = 1, 2, ..., (N + 1)/2. Equation (13) constitutes the reduction process, and equations (16)-(17) the solution process; together they compose the computing task in the fine level process of the PDES solution. Each discretized point requires only several 2 × 2 matrix multiplications, i.e., a short series of arithmetic operations. Therefore, the computing task is lightweight and equation (12) can be solved efficiently on the GPU.
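For reference, such a 2-unknown block system can be solved by standard block elimination (a sketch with δ = 2^k; the level superscripts are omitted for readability, and the paper's equations (16)-(17) should be equivalent up to notation):

```latex
% The two coupled equations at the final level:
\mathbf{B}_i \mathbf{X}_i + \mathbf{C}_i \mathbf{X}_{i+\delta} = \mathbf{D}_i,
\qquad
\mathbf{A}_{i+\delta} \mathbf{X}_i
  + \mathbf{B}_{i+\delta} \mathbf{X}_{i+\delta} = \mathbf{D}_{i+\delta}

% Block elimination of X_{i+delta}:
\mathbf{X}_i = \bigl(\mathbf{B}_i
  - \mathbf{C}_i \mathbf{B}_{i+\delta}^{-1} \mathbf{A}_{i+\delta}\bigr)^{-1}
  \bigl(\mathbf{D}_i - \mathbf{C}_i \mathbf{B}_{i+\delta}^{-1} \mathbf{D}_{i+\delta}\bigr)

\mathbf{X}_{i+\delta} = \mathbf{B}_{i+\delta}^{-1}
  \bigl(\mathbf{D}_{i+\delta} - \mathbf{A}_{i+\delta} \mathbf{X}_i\bigr)
```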

Flow diagram of the two-level parallel simulation process
Now, we establish a two-level model for the division of the computing task. Figure 6 shows a flow diagram of the two-level parallel simulation process.

Implementation of the two-level parallel simulation process on GPU
In this section, the implementation of the two-level parallel simulation process on GPU is described. In contrast with CPU programming, programming for GPU must consider the relation between the code and the hardware, especially thread mapping, memory organization and memory access. The implementation of these three parts are discussed below.

Thread mapping model
To manage and dispatch the GPU threads as required by the algorithm, i.e., to establish a correct thread mapping model, a development tool such as OpenCL or the Compute Unified Device Architecture (CUDA) is needed to aid GPU programming. Of the two, CUDA is the more widely used GPU programming environment; it is based on C/C++ and targets GPUs produced by NVIDIA Corporation [33]. As shown in Figure 7, CUDA uses a two-level thread mapping model, i.e., the grid level and the thread block level. Host and Device denote the CPU and the GPU, respectively. A Kernel represents the part of the problem that can be parallelized; it is launched by the CPU and executed on the GPU. Each Kernel has a grid structure consisting of multiple thread Blocks, and each thread Block in turn consists of multiple threads arranged in a grid structure. The GPU thus exposes a two-level parallel execution model, which integrates perfectly with the two-level parallel simulation process proposed in this paper. In general, the coarse-level and fine-level parallel computing tasks are mapped to grids and thread blocks, respectively. Thread blocks are executed concurrently on the GPU, and threads in the same block are also executed concurrently. Therefore, there are two thread mapping models: the first, for the DIMENS decomposition, is the coarse-level thread mapping model; the second, for solving the PDES and the BES, is the fine-level thread mapping model.
Since the pipes are decoupled by DIMENS, the coarse-level thread mapping model is readily constructed, as shown in Figure 8. In this model, each thread block is responsible for solving the PDES of the corresponding pipe.
In the fine-level thread mapping model used to solve the PDES, the computing tasks for each discrete point, i.e., equations (13), (16) and (17), are mapped to a single thread in the corresponding thread block. Figure 9 is a sketch of the thread mapping model.

With respect to solving the BES, the thread mapping model is developed for matrix-matrix and matrix-vector multiplications. Fortunately, CUDA provides abundant libraries with highly optimized functions for linear algebra, such as cuSPARSE, cuSOLVER and cuBLAS (CUDA Toolkit Documentation v7.5). In this study, the cuSPARSE library is used to accelerate the CG method for solving the BES.

Memory organization
Memory organization is the way of distributing different data to different types of memory on the GPU. There are five primary types of memory on GPU, i.e., registers, shared memory, global memory, constant memory and texture memory. Figure 10 shows the memory structure on GPU, where the data stored in Register and Shared Memory are private for thread and thread block, respectively. Data stored in Global, Constant and Texture memory are public for all threads and thread blocks. Table 1 demonstrates the access latency of different types of memory.
Although registers and shared memory offer excellent performance compared with the other types of memory, register and shared memory resources on the GPU are very limited. Therefore, the memory organization should be carefully designed so that frequently used data are stored in high-speed memory to increase the cache hit rate. Table 2 shows the memory organization for solving the PDES on the GPU.

Memory access
In addition to an appropriate memory organization, coalesced access for global memory is also required to avoid repetitive access with low efficiency [27], i.e., data locality is very important. Some APIs provided by CUDA, e.g., cudaMallocPitch, have been developed to support coalesced access for data in global memory.
The performance of the above optimizations is tested with different numbers of discrete points in the pipe, i.e., different sizes of the PDES. Figure 11 shows a great improvement in computing performance when using high-speed memory and coalesced memory access. All the numerical experiments conducted in this study are based on the hardware configuration introduced in Section 5. The data in Figure 11 show that, in these four cases, the speedup ratios from using both optimization methods are 5.67x, 4.00x, 3.83x and 3.57x, respectively.

Case study
In this section, experiments are provided to test the performance of the proposed GPU-based simulation of large-scale gas pipeline networks, referred to as the GPU method for short. To illustrate the efficiency of GPU acceleration, its computing speed is compared with that of the DIMENS method, implemented on the CPU as a sequential FORTRAN program, and with the widely used commercial software Stoner Pipeline Simulator (SPS). A PC with an E5-2640 V3 CPU and a GTX 980 GPU is used for the experiments, and the CUDA version is 7.5.

Small-sized natural gas pipeline network case

Description of pipeline network
First, to evaluate the accuracy of the proposed GPU method, a small-sized gas pipeline network is tested. Figure 12 shows the topological structure of a gas gathering pipeline network in a particular oil field [34]. The pipeline network comprises 73 pipes, 70 junction nodes and 52 boundary nodes. The structure data and parameters of pipes are presented in the Appendix Table B1. The boundary conditions and the components of the natural gas are listed in the Appendix Tables B2 and B3, respectively.

Comparison of numerical accuracy
The steady-state simulation results computed by the SPS software, the DIMENS method and the proposed GPU method are shown in Tables 3 and 4. The total simulation time is 120 h. The time step is Δt = 6 s and the spatial step is Δx = 600 m, with which a grid-independent numerical solution is obtained. It should be noted that: (1) the flowrate at the junction nodes represents the net value; (2) the max error represents the maximum error between the SPS software and the GPU method and between the DIMENS method and the GPU method.
The maximum relative error of the flowrate obtained by the three methods is 4.15%, at junction node 32, as shown in Table 3; the maximum relative error of the pressure is 2.88%, at the end node of pipe 1, as shown in Table 4. These maximum relative errors are acceptable in numerical simulation and engineering practice. This comparison indicates the good precision of the proposed GPU-accelerated method.

Comparison of computing speed
The computing acceleration of the GPU method is compared with that of the SPS software and the DIMENS method with different spatial steps Dx. The results are shown in Table 5.
It can be seen that the SPS software is the most time-consuming. The computing time of the DIMENS method is about half that of the SPS software. For example, when the spatial step is Δx = 400 m, the computing times of the SPS software and the DIMENS method are 5848.29 s and 2735.21 s, respectively; that is, the DIMENS method is about 2 times faster than the SPS software, which is consistent with the conclusion of our previous study [17]. Additionally, the computing time of the GPU method is much smaller than those of the SPS software and the DIMENS method, at about 1/10 and 1/5, respectively. For example, when the spatial step is Δx = 400 m, the computing time of the GPU method is 516.62 s, about 1/10 of 5848.29 s and 1/5 of 2735.21 s. This indicates that the GPU method is about 10 and 5 times faster than the SPS software and the DIMENS method, respectively. Moreover, the speedup ratio of the GPU method vs. SPS increases from 5.73 to 12.63 as the spatial step decreases from 600 m to 200 m; that is, the speedup ratio increases as the computation scale increases. This indicates that the GPU method adapts well to the simulation of pipeline networks.

Large-scale natural gas pipeline network cases
In Section 5.1, the calculation accuracy and computing speed of the GPU method were verified on a small-scale pipeline network. In this subsection, the calculation speed is further tested on large gas pipeline networks.
With the pipeline network shown in Figure 12 as the basic network unit, five pipeline networks, named No. 1 to No. 5, are constructed; they contain different numbers of network units and therefore have various sizes. The topological graphs of these five gas pipeline networks are too large to show in this paper, so only the component data are given in Table 6. The main purpose of designing these five pipeline networks is to verify the applicability of the proposed method to large-scale pipe networks. All five cases are transient simulations. The simulation time is 1 day, and the mesh spatial size and time step are Δx = 500 m and Δt = 30 s, respectively.
The computing times of the three methods on the five large pipeline networks are shown in Figure 13. The computing time of the SPS software is the largest, while that of the GPU method is the smallest, suggesting that the simulation efficiency of the GPU method is high. Additionally, the computing times of the SPS software and the DIMENS method depend linearly on the total number of discrete points, while the computing time of the GPU method is almost stable. In other words, when the total number of discrete points increases, the computing times of the SPS software and the DIMENS method increase linearly, but that of the GPU method remains almost constant. Thus, the GPU method has high computational speed and adapts well to large pipeline networks.
The comparisons of the computing times of the three methods on the five large pipeline networks are shown in Table 7. The speedup ratio of DIMENS vs. SPS is about 2 in all five large networks, which is consistent with the data in Table 5; that is, the DIMENS method is also about 2 times faster than the SPS software for large pipeline networks. This confirms that the DIMENS method is well suited to large pipe networks, consistent with the conclusion of our previous study [17]. In contrast, the speedup ratio of the GPU method is notable. The minimum speedup ratios of the GPU method vs. DIMENS and of the GPU method vs. SPS are 7.93 and 17.24, respectively, both much larger than 2. For the GPU method vs. SPS, the maximum speedup ratio is as high as 57.57, further confirming the high computational speed of the GPU method. The speedup ratios of the GPU method on the five large pipeline networks are shown in Figure 14. The speedup ratios of the GPU method vs. SPS and vs. DIMENS increase as the total number of discrete points in the pipeline network increases, depending on it approximately linearly. This implies that the larger the pipeline network, the larger the speedup ratio of the GPU method, and it further confirms that the GPU method adapts well to the simulation of pipeline networks, especially large ones.

Conclusion
To further improve the simulation efficiency of natural gas pipeline networks, a novel GPU-accelerated hydraulic simulation was studied in this paper. First, based on the DIMENS method from our previous study [17,18], a two-level pattern for dividing the computing tasks and the corresponding parallel numerical method were proposed. Then, the thread mapping, memory organization and memory access on the GPU were carefully designed, and the two-level parallel process was implemented on the GPU. Last, one small and five large natural gas pipeline networks were presented to verify the accuracy and efficiency of the proposed GPU method. The following conclusions can be drawn:
1. The proposed GPU method has high accuracy. The results obtained by the GPU method are almost the same as those of the DIMENS method and the SPS software. In the small natural gas pipeline network case, the maximum relative error among the three methods is only 4.15%.
2. The proposed GPU method has a notable speedup. In the five large-scale pipe network cases, compared with the well-known commercial simulation software SPS, the speedup ratio of the proposed method is up to 57.57 with comparable calculation accuracy.
3. The proposed method adapts well to large pipeline networks. In the five large-scale pipe network cases, the larger the pipeline network, the larger the speedup ratio of the proposed method. The speedup ratio of the GPU method depends approximately linearly on the total number of discrete points in the network.