A Review of Kinetic Modeling Methodologies for Complex Processes

— In this paper, kinetic modeling techniques for complex chemical processes are reviewed. After a brief historical overview of chemical kinetics, an overview is given of the theoretical background of kinetic modeling of elementary steps and of multistep reactions. Classic lumping techniques are introduced and analyzed. Two examples of lumped kinetic models (atmospheric gasoil hydrotreating and residue hydroprocessing) developed at IFP Energies nouvelles (IFPEN) are presented. The largest part of this review describes advanced kinetic modeling strategies, in which the molecular detail is retained, i.e. the reactions are represented between molecules or even subdivided into elementary steps. To be able to retain this molecular level throughout the kinetic model and the reactor simulations, several hurdles have to be cleared ﬁ rst: (i) the feedstock needs to be described in terms of molecules, (ii) large reaction networks need to be automatically generated, and (iii) a large number of rate equations with their rate parameters need to be derived. For these three obstacles, molecular reconstruction techniques, deterministic or stochastic network generation programs, and single-event micro-kinetics and/or linear free energy relationships have been applied at IFPEN, as illustrated by several examples of kinetic models for industrial re ﬁ ning processes. Résumé article,


INTRODUCTION
When dealing with chemical reactions, thermodynamics allows us to know whether a reaction is possible, and to calculate its chemical equilibrium constant. Hence, for a given chemical reaction under given experimental conditions and with a known initial composition of the mixture, thermodynamics is able to predict its equilibrium composition, the final steady state. However, thermodynamics does not say anything concerning the rate at which a reaction proceeds towards equilibrium, nor the pathway that is followed, i.e. the dynamics that lead to that final steady state.
The study of the dynamics of reactions is the field of chemical kinetics, namely the study of the factors that determine the reaction rates and that are responsible for the establishment of chemical equilibria in reversible reactions. The goal of chemical kinetics is therefore twofold: determining the overall pathway, the reaction network, the reaction mechanisms and the various intermediates, and deriving quantitative rate equations (with their rate parameters) which predict the rate of reaction as a function of the local conditions (temperature, pressure, phase state, composition, ionic strength, pH, solvent, catalysts, etc.). The formulation of the reaction mechanism used for the kinetic modeling is always subject to discussion and should be based on all available physical measurements, including surface science data, chemical knowledge, and kinetic experiments (preferably far from equilibrium). Hence, chemical kinetics needs to combine knowledge acquired through various other disciplines, such as analytical chemistry (composition), organic chemistry (reaction types), physical chemistry (structure of matter), classical thermodynamics (equilibrium), statistical thermodynamics and quantum mechanics (rate theories), spectroscopy and computational chemistry (reaction intermediates), and mathematics (parameter estimation and process simulation), among others. By predicting the rates of the various reaction pathways, chemical kinetics allows the prediction of production rates and selectivities, and is therefore a necessary tool in the modeling and design of chemical reactors. Hence, chemical kinetics is one of the pillars of the chemical engineering discipline.
Chemical kinetics is a discipline that quantitatively describes the progress of reactions on a large range of different scales: from interactions between atoms and electrons in chemical bonds, to production rates in chemical reactors. Kinetics on the level of individual molecules describes the modifications that the reactants undergo to form the products. This is often referred to as reaction dynamics or molecular dynamics, and is nowadays an integral part in the domain of chemical kinetics. In these studies (molecular beam experiments, laser spectroscopy, ab initio theoretical chemistry, transition state theory, etc.), the molecular structures that exist along the reaction pathway on the Potential Energy Surface (PES) are analyzed, directly linking reactivity to quantum mechanics. Kinetics on large ensembles of molecules describes the average behavior of the reactants, linking reactivity and the extent of reaction to the operating conditions (temperature, composition, etc.) and thermodynamics.
Chemical kinetics is a relatively young science. Although descriptive kinetics already started just before 1800, with publications from Karl Friedrich Wenzel in 1777 and from Claude Louis Berthollet in 1803, the first known study of quantitative chemical kinetics dates back to 1850, when German physicist Ludwig Ferdinand Wilhelmy published a study on the acid hydrolysis of sucrose into glucose and fructose in a batch reactor, establishing for the first time an empirical relation between the reaction rate and the concentration of the reactants. In his rate equation, the reaction rate was proportional to the concentrations of sugar and acid, while his proportionality constant was an exponential function of the inverse of the absolute temperature. Moreover, he proposed a differential equation to describe the decrease in sugar concentration over time, thereby writing down a differential equation that was probably the first differential equation in chemistry and one that is still used today for elementary bimolecular reactions in closed systems.
In their 1862 publication on the esterification between acetic acid and ethanol, French chemists Pierre Eugène Marcellin Berthelot and Léon Péan de Saint-Gilles showed that some reactions cannot proceed to total conversion and they had to introduce into their empirical rate equation a limiting factor, which was later shown to be the equilibrium constant. By reexamining these results and interpreting their own experimental data through the laws of classical mechanics, two Norwegian scientists, mathematician Cato Maximilian Guldberg and chemist Peter Waage, made a significant breakthrough by formulating the mass action law, one of the basic laws in chemical kinetics. In their 1864 publication, they introduced the equilibrium mass action law, and formulated the form of the rate equations. They arrived at this form through the laws of classical mechanics: they postulated that molecular collisions have to precede the actual reaction, and proposed that reaction equilibrium is a balance of two opposing 'affinity forces', one due to the reactants and one owing to the products. For the esterification reaction, they stated that, at equilibrium, the two opposing 'affinity forces' (i.e. the reaction rates) had to be equal, and that each 'affinity force', in analogy to the law of gravity, could be expressed as the product of the 'affinity coefficients' (i.e. rate constants) and the 'action masses' (i.e. concentrations) of the two reacting species. In their later work, they generalized these mass action laws to any number of reactants, both for irreversible and reversible reactions of any molecularity, by explicitly using the stoichiometric coefficients of the reaction in the rate equation, leading to the publication of the mass action law in its final form in 1879.
In the same period, at the University of Oxford, chemist Augustus George Vernon Harcourt sought the help of mathematician William Esson. In their papers from 1865 and 1866, they demonstrated that the reactant concentrations decrease logarithmically with time. Therefore, the rate of reaction does not remain constant, but depends on the concentration of the reactant. They were also the first to treat the kinetics of consecutive reactions. In their 1867 paper, the time dependent evolution was interpreted in terms of differential equations. In their meticulous and systematic study of the concentration dependence of reaction rates, the reaction rate was assumed to be proportional to the original amounts of the reactants and a constant depending on the considered system, which they baptized rate constant k. In their studies on the temperature effect, they also postulated that this rate constant has to become zero at absolute zero, since the reactants no longer move and cannot interact or collide. This, together with their comparison of a chemical reaction to the fall of bodies, where potential energy is transformed into kinetic energy, illustrates that Harcourt and Esson started the introduction of thermodynamics into chemical kinetics.
The next major event arrived in 1884, when Dutch scientist Jacobus Hendricus van 't Hoff published the first monograph on chemical kinetics under the title 'Études de dynamique chimique'. In his book, the author compiled, generalized and further developed the work on chemical kinetics. He described a graphical method for determining the order of a reaction in kinetic studies, studied the influence of the solvent for reactions in solution, introduced the modern concept of chemical affinity to replace the classical mechanical representation, classified reactions in terms of the number of interacting species (which is now referred to as their molecularity), derived kinetic rate equations for mono-and bi-molecular reactions, and considered polymolecular processes as a sequence of mono-and/or bimolecular steps. He derived the van 't Hoff relation for the dependence of the equilibrium constant on the absolute temperature, and was also the first to correctly formulate the effect of temperature on the rate constants, although he did not succeed in finding a theoretical interpretation for the various terms. His essential contribution lies in the link he made between the equilibrium constant of a reaction, and the rate constants for the direct and the reverse reaction, thereby formally connecting thermodynamics and kinetics.
In Baltic Germany, Friedrich Wilhelm Ostwald worked on catalysis, electrochemistry, chemical equilibria (Ostwald's dilution law) and reaction rates. In 1887, Ostwald derived the kinetics of autocatalytic reactions, introduced the terms 'half-life' and 'reaction order', and worked on ways to determine the reaction order. During the same period, he also coined the word 'mole', linked it to ideal gases, and defined it as the molecular weight of a substance in mass grams. Ostwald also wrote the famous two-volume textbook 'Lehrbuch der Allgemeinen Chemie', in which he included his developments on catalysis and chemical kinetics. His most important contributions consisted of formulating that each reaction exists and proceeds independently, interacting only on the level of a material balance (principle of independence of chemical reactions), that catalysis was nothing but a kinetic phenomenon, where a foreign compound accelerated the kinetics by lowering the energy barrier, and hence, that a catalyst cannot change the equilibrium of a reaction.
Based on the observation that reactions accelerate when heat energy is added, and on the formulae derived by van 't Hoff, Swedish scientist Svante August Arrhenius postulated in 1889 that, in order for a reaction to proceed, an energy barrier has to be overcome before two molecules will react. As did van 't Hoff previously, he claimed that the temperature was not the reason for reaction, but that it is responsible for the changes in its rate. In the same work, Arrhenius also had an important additional insight: he postulated that this temperature dependence indicates the existence of an 'activated molecule', whose concentration is proportional to the total concentration of reactant molecules, but is exponentially temperature dependent. To satisfy the latter conditions, he proposed an equilibrium between normal and activated reactant molecules. By introducing the concept of an 'activated state', Arrhenius can therefore be considered the father of all reaction rate theories. He then expressed the temperature dependence of the reaction rate through the rate constants in his famous Arrhenius equation (while crediting van 't Hoff for the origin of his equation), simultaneously coining the term 'activation energy' and introducing its symbol, E. Both the pre-exponential factor (or frequency factor) and the activation energy are considered to be independent of temperature. Arrhenius also provided a molecular level interpretation of the exponential factor as the fraction of collisions with sufficient energy, but his equation remains an empirical equation, however, as the pre-exponential factor is not explained at all.
In 1905, German physicist Walther Hermann Nernst established what is now called the Third law of thermodynamics (which describes the behavior of matter as the temperature approaches absolute zero, and postulates that the entropy of a perfect crystal at absolute zero is exactly equal to zero). He described the calculation of chemical affinities, provided for the first time a means of determining free energies and hence equilibrium states of chemical reactions from heat measurements, and allowed the unambiguous calculation of entropies.
In Germany, at the end of the 19 th century, Max Ernst August Bodenstein studied the bromination of hydrogen to produce hydrogen bromide in conditions where the reaction was not limited by equilibrium. Yet, he found that the rate of the reaction was limited by its product. Without knowing, Bodenstein had discovered a free-radical chain reaction with two propagation steps. Bodenstein also showed that Arrhenius' law is only applicable to elementary reactions and that overall reactions often show deviations. Later on, in 1913, Bodenstein studied the chlorination of hydrogen and observed that absorption of a single photon could yield up to 10 6 molecules of HCl, clearly indicating that this reaction could not be a single step reaction. This behavior in various halogenation reactions was explained in 1918 by Walther Hermann Nernst, who conceived the theory for chain reaction mechanisms that was further developed by Bodenstein. In his 1913 paper, Bodenstein also postulated that chemical reactions follow a reaction mechanism consisting of several steps, in which reaction intermediates are present in inferior amounts, that the rate of change of their concentrations can be considered negligible and that the reaction intermediates are therefore in a quasi-steady state, an approach which had been proposed six months earlier in 1913 by David Leonard Chapman. Throughout his work and publications, Bodenstein developed, implemented and disseminated this quasi-steady state approach, of which he was one of the biggest advocates.
In the same period, chemical kinetics was also extended to systems other than homogeneous gas phase reactions. In 1913, German biochemist Leonor Michaelis and Canadian physician Maud Leonora Menten investigated the kinetics of an enzymatic reaction, in which an invertase catalyzed the hydrolysis of sucrose into glucose and fructose. They proposed a two-step reaction mechanism, in which the first step was at quasi-equilibrium, and developed the corresponding rate equation. Reactions on surfaces were studied in a large number of papers, as many elements were found to promote a large range of reactions. The understanding of the elementary mechanisms occurring in heterogeneous catalysis took a great step forward in 1915 when Irvin Langmuir, who worked at the General Electric research laboratory in Schenectady, New York, developed a simple but quantitative theory of adsorption of gases on metallic surfaces that predicted the surface coverage as a function of the gas phase conditions. In 1925, British chemist Hugh Stott Taylor, who worked at Princeton University, distinguished physisorption and chemisorption, and suggested that a catalytic reaction is not catalyzed over the entire solid surface of the catalyst but only at certain locations, which he called active sites. As these chemically active sites can be sparse on the catalyst surface, the reaction can be inhibited by relatively few molecules. Subsequently, in 1926, English chemist Cyril Norman Hinshelwood proposed a general mechanism for reactions on surfaces, and used the Langmuir adsorption isotherm to derive the well-known Langmuir-Hinshelwood rate equations for catalytic surface reactions, in which two molecules, both adsorbed on the catalyst surface on adjacent sites, give rise to a bimolecular surface reaction.
Based on the kinetic theory of gases, German chemist Max Trautz and British scientist William Lewis, in 1916 and1918 respectively, independently derived and published the collision theory, unaware of each other's work due to World War I. The collision theory starts from the premise that a reaction can only happen when two molecules collide. In this representation, molecules are symbolized by two colliding spheres that undergo reaction if their kinetic energy exceeds the activation energy E, or else simply fly apart after an elastic collision if they do not have enough kinetic energy. Using the Boltzmann distribution for the velocities of the molecules, the reaction rate contains the concentrations of the reactants, the rate of collisions, and an exponential factor containing the activation energy. In this way, the collision theory provides a framework for calculating the preexponential factor of a reaction. This theory gives relatively accurate values for pre-exponential factors of reactions between monatomic species (as expected from its basic assumptions), but often fails by several orders of magnitude for reactions between molecules with more complicated structures.
In 1922, Frederick Alexander Lindemann proposed a multi-step collision model to explain the variation of the apparent reaction order of monomolecular decomposition reactions. He assumed that these monomolecular decomposition reactions consist of three elementary steps, in which molecules were activated by bimolecular collision processes and then either followed by monomolecular dissociation steps or 'deactivated' by a bimolecular collision. Soon after that, in 1926, Cyril Norman Hinshelwood developed a more quantitative approach for the Lindemann mechanism by modeling the internal degrees of freedom of the reactant as several equivalent simple harmonic oscillators, and by using statistical methods to determine the probability of the molecule being activated. Further extensions were developed, in 1927 by Oscar Knefler Rice and Herman Carl Ramsperger, and independently by Louis Stevenson Kassel in 1928, by assuming that a molecule is a system of loosely coupled oscillators, leading to the Rice-Ramsperger-Kassel (RRK) theory for unimolecular gas reactions.
Around 1930, Michael Polanyi, director of the Haber Laboratory at the Kaiser Wilhelm Institute of Berlin, worked together with a young Mexican-American visiting scientist named Henry Eyring on the application of the wave equation for hydrogen to the quantum mechanical description of the atom exchange reaction between ortho-and para-hydrogen molecules. Their starting point was Fritz London's approximate solution of the wave function for the atomic transition complex H 3 . Polanyi and Eyring separated the nuclear and electronic motions, now referred to as the Born-Oppenheimer approximation, and calculated the energy of the system by considering that the reacting system's electronic structure would rearrange to an equilibrium state in every nuclear configuration. They refined their calculations by using spectroscopic results to refine their estimates of electronic energies, creating an innovative 'semi-empirical' method. In their fundamental paper from 1931, they described bimolecular gas reactions between three atoms and were able to exactly calculate, by means of quantum mechanics, the energy of the molecular and atom states, and hence the activation energies, relying both on quantum theory and on experimental data. In this way, they constructed the first Potential Energy Surface by plotting the energy as a function of the distances between the three atoms.
After Henry Eyring returned to the United States to become an instructor at Princeton University in 1931 and Michael Polanyi moved from Berlin to the University of Manchester in 1933, both continued their previously joint work on the energetic description of all configuration states of a chemical system and the practical exploration of PES. In 1935, with only one month difference, both Henry Eyring in Princeton, and Michael Polanyi with Meredith Gwynne Evans in Manchester, independently published a paper on a theory that allowed the description of elementary steps and the calculation their pre-exponential factors, known as the Absolute-Rate Theory, Activated-Complex Theory (the term coined by Eyring), or Transition State Theory (the term coined by Polanyi). This theory combines quantum mechanics (to evaluate the PES) and statistical mechanics (to calculate the reaction rate). By evaluating the ratio of the partition functions of the transition state in the PES diagram and the fundamental state of the reactants, they obtained an Arrhenius-type equation that allows the calculation of the values of both the pre-exponential and the exponential factors; the Eyring equation. With this, the last fundamental element of the theoretical framework was added to the classical 'Chemical Kinetics' approach.
Since the development of PES and the transition state theory in the early 1930s, many theoretical concepts have been introduced, further developed, refined and extended. Hence, only a short overview will be given. In 1931, Lars Onsager worked on non-equilibrium kinetics and postulated the Onsager regression hypothesis, stating that microscopic fluctuations and macroscopic relaxation are linked, some twenty years before it was finally proven to be true by the so-called fluctuation-dissipation theorems. In 1931, Jens Anton Christiansen studied reactions in open sequences, proposed the catalytic cycle, and derived general formulae for reactions on single catalytic sites. Around 1936, he also developed a reaction rate theory by modeling a chemical reaction as an intra-molecular diffusion process. This diffusion description was fully systematized in 1940 by Dutch physicist Hendrik Anthony Kramers and is complementary to quantum mechanical transition state theory, which was shown to be a particular case of Kramers' pure classical method. In 1934, Soviet physicist Nikolay Nikolayevich Semenov was the first to describe the so-called branched chain reaction. He determined the various steps in the mechanisms of chain processes and developed a general and quantitative chain chemical reaction theory that considered both branched and unbranched chain processes. Semenov applied his theory to various types of chain reactions and, more especially, to combustion reactions to study flame spreading, detonation, and the burning of explosives. Several of these aspects were also independently derived by Cyril Norman Hinshelwood, but Semenov extended this work by proposing a theory of degenerate branching, which led to a better understanding of the phenomena associated with the induction periods of oxidation processes. In 1939, Japanese scientist Juro Horiuti, who worked on electrode kinetics and heterogeneous catalysis, introduced the concept of stoichiometric numbers for a reaction mechanism, leading to the derivation of generalized rate expressions. Together with Michael Polanyi, he also proposed the Horiuti-Polanyi mechanism for heterogeneously-catalyzed hydrogenation reactions in 1934. In 1943, Olaf A. Hougen and Kenneth M. Watson extended the Langmuir-Hinshelwood approach by explicitly considering the reverse reaction. They also used the same approach for reaction mechanisms in which the adsorption or the desorption step was rate limiting. After World War II, Rudolf Arthur Marcus extended the RRK theory for unimolecular gas reactions in 1952 by accounting for the individual vibrational modes of the system, in line with transition-state theory, thereby enabling the calculation of simple estimates for unimolecular reaction rates from a limited number of characteristics of the PES. In 1954, Dutch scientists Pieter Mars and Dirk Willem van Krevelen described their well-known multistep reaction mechanism for heterogeneously-catalyzed oxidation reactions, in which lattice oxygen is used to oxidize the reactant while gas-phase oxygen replenishes the lattice vacancies. In 1959, Soviet chemist Boris Pavlovich Belousov published the existence of non-linear oscillating reactions, now known as the Belousov-Zhabotinsky reaction, and initiated the field of modern nonlinear chemical dynamics, illustrating that chemical reactions do not have to be dominated by equilibrium thermodynamic behavior. In 1966, Michel Boudart proposed the concept of turnover frequency. He also coined the terms Most Abundant Surface Intermediate (MASI) and structuresensitive reaction. A last trend worth mentioning, but which falls outside the direct scope of chemical kinetics, concerns direct observations and analytical measurements: from the 1950s onwards, kinetic studies were also strongly influenced by the development of various new analytical techniques, and in the case of heterogeneous catalysis, by novel surface study techniques, which culminated in the work of Gerhard Ertl, who provided detailed observations showing how chemical reactions take place on surfaces.
1 THEORETICAL BACKGROUND

Elementary Reactions
An elementary reaction, also called an elementary step or elementary process, is a direct transformation of one or more reactants into one or more products, in which all bonds between the reacting species are broken and formed simultaneously. Figure 1 schematically illustrates the potential energy diagram for the gas phase synthesis of hydrogen iodide. When the collision between a hydrogen molecule and a iodine molecule provides enough energy to create the transition state, this elementary reaction can proceed to form hydrogen iodide.
Most elementary reactions are mono-or bi-molecular, even though tri-molecular elementary reactions have been reported on rare occasions. Higher order reactions cannot be considered as elementary reactions. Indeed, the probability that three or more reactant molecules are at the same time in the same location with the right configuration, conformation and orientation, and with sufficient energy for reaction is extremely low or non-existent.
Elementary reactions have the following properties: the reaction proceeds in a single step; the reaction passes over a single transition state; no reaction intermediates appear during the transformations of reactants into products; the reaction can proceed in both directions over the same transition state (principle of microscopic reversibility); the reaction rate for the forward and the backward elementary step is given by Guldberg and Waage's mass action law: the overall order of reaction is given by its molecularity; the reaction order with respect to each reactant is equal to the stoichiometric coefficient of the reactant; the rate coefficient is independent of composition and follows the Arrhenius law: the ratio of the forward to the backward rate constants equals the thermodynamic equilibrium constant: The latter equation is called the 1 st relation between thermodynamics and chemical kinetics. By substituting the Arrhenius law and the van 't Hoff relation into this equation, and developing the equations for the different thermodynamic reference states, one also derives the 2 nd relation between thermodynamics and chemical kinetics, which relates the heat of reaction (DH R ) to the difference between the forward and the backward activation energies (E + À E -). Further substitution yields the 3 rd relation between thermodynamics and chemical kinetics, which relates the reaction entropy (DS R ) to the ratio of the forward and the backward pre-exponential factors (A + /A -).
Given the above-listed properties, the kinetic treatment of elementary reactions is relatively simple, since no reaction intermediates appear between reactants and products.

Multistep Reactions
When not all bonds are broken and formed at the same time, intermediates appear in the reaction pathway going from the reactants to the products. In such a case, the reaction is called an overall or multistep reaction, but many other terminologies are used: global reaction, apparent reaction, operational reaction, complex reaction, composite reaction, multiple step reaction, stepwise reaction, etc. An example of a multistep reaction is illustrated in the schematic energy diagram in Figure 2. As can be seen, in this overall reaction, the reactants A and B are transformed into products R and S in four steps, passing over four transition states and leading to three distinct intermediates. Put differently, this overall reaction can be decomposed into a sequence of four individual sub-steps or elementary steps. The information on the precise course of reaction in progress, i.e. the set of elementary steps, is called the reaction mechanism. Most chemical reactions are global (i.e. non-elementary) reactions. In many cases, the intermediates are short-lived highly reactive species. Hence, these reactive species are often present in trace concentrations in an analyzed phase, or are species in a non-analyzed phase (e.g., an adsorbed phase) such as the catalyst surface.
From the thermodynamic side, a number of thermodynamic relations can be written that link the individual elementary steps to the overall reaction. To this end, we will first define the stoichiometric coefficients and the stoichiometric numbers. The stoichiometric coefficients m j apply to a component in a reaction, and arise from the atomic balance for the reaction. The stoichiometric numbers r i apply to an elementary step in an overall reaction, and indicate how many times this elementary step is required in the reaction mechanism to arrive at the overall reaction. For an overall reaction with N species whose reaction mechanism consists of s elementary steps, the equilibrium constant of the overall reaction is given by the product of the equilibrium constant of each elementary step to the power of its stoichiometric number: If one only looks at the enthalpy term is this equation, one can derive that the overall heat of reaction corresponds to the heat of reaction of each elementary step, multiplied by its stoichiometric number: An example of these thermodynamic relations is given in Figure 3 for the catalytic synthesis of ammonia from hydrogen and nitrogen.
On the side of chemical kinetics, the objective is to be able to calculate the rate of the overall reaction. In chemical engineering approaches, empirical expressions such as power laws are often fitted to a set of experimental data without accounting for the reaction mechanism and the various reaction intermediates. By proceeding in such a manner, the physical and chemical background is completely ignored, and such power law models often have a very limited applicability. The rigorous approach consists of applying the mass action law to each individual step, leading to a set of s rate equations. For each species, a net production rate can now be written as a function of the reaction rates: Depending on the detail that is required and the information that is available, three main solution methods exist to calculate the rate of the overall reaction: the complete solution, the quasi-steady state approximation, the rate-determining step approximation.
The complete solution is obtained by solving the kinetics for the most general case. In this approach, the evolution of the N species is obtained by solving the set of N production rates (containing the s mass action law rate equations of the elementary steps) without any approximations or additional hypotheses. For extremely simple cases, an analytical   Figure 2 Illustration of a complex reaction. solution can be found, but most reaction mechanisms require the solution of these equations numerically. The advantage is that this approach is able to describe the transient kinetic behavior of the reaction, including the dynamics of all reaction intermediates. Numerical solutions are in most cases obtained by using the 'mean-field approach'. The mean-field approximation assumes: that all species are randomly distributed, that the interaction of all other individuals on any given individual may be replaced by a single averaged or effective interaction, i.e. the mean field that is due to the neighboring particles. The mean-field theory is based on the assumption that the fluctuations around the average value of the variables (pressure, temperature, concentrations, etc.) are sufficiently small to be averaged out over large ensembles and hence neglected. While the underlying approximation may easily be fulfilled in gaseous or liquid systems, this is not necessarily true for reactions on solid surfaces, where the interactions between adsorbed species may cause the formation of ordered phases. Additionally, energetic heterogeneities on the surface, surface diffusion and attractive or repulsive interactions between adsorbed species will lead to stronger deviations from the mean-field assumptions. When the mean-field approximation breaks down, the numerical solution has to be calculated by means of Monte-Carlo (MC) simulations, which offer in principle an exact description of the interplay between the elementary steps. A major drawback in all of the above solution techniques (analytical solution, mean-field approximation, or MC simulations), however, is that all kinetic parameters of all elementary steps need to be known in order to be able to calculate the complete solution.
The second approach to finding the reaction rates is the Quasi-Steady State (QSS) approximation. In order to simplify the set of equations and reduce the number of rate parameters in the system, this approach assumes that there is no net rate of production for the M intermediates in the reaction mechanism. Indeed, highly reactive intermediates will react as soon as they are formed. Hence, after an initial induction period, these intermediates reach a QSS concentration, generally in inferior concentrations. Although an actual steady-state can only be truly reached in open reactors, the QSS approximation is also applied for closed systems (batch reactors) as relaxation times (the time taken to reach the steady-state) are generally shorter than the reciprocal turnover frequency. For each intermediate, its net production rate is therefore set to zero, resulting in a subset of M algebraic equations in M unknown concentrations. Solving this subset of equations makes it possible to obtain the concentrations of the M intermediates as a function of the concentrations of the remaining measured species and the rate parameters. In many cases, analytical expressions can be obtained for the concentrations of the M intermediates, leading to a closed analytical expression for the rate equation of the overall reaction. The main advantages of the QSS approximation are that fewer kinetic parameters are required to obtain the reaction rate, and that an analytical expression is often obtained. In contrast to the complete solution, timedependent phenomena within the full reaction mechanism cannot be accounted for, such as branching chain reactions that lead to explosions.
The third approach is the Rate Determining Step (RDS) approximation, also called the quasi-equilibrium approximation. In this approach, one assumes that one of the elementary steps determines the overall reaction rate, while all other elementary steps are sufficiently fast to be considered as being in quasi-equilibrium. Hence, for a reaction mechanism with s elementary steps, the assumption of quasiequilibrium for all but one elementary step leads to a subset of s À 1 algebraic equations. This allows the expression of the concentrations of the various intermediates as a function of the concentrations of the measured species and the equilibrium constants of the elementary steps. Equaling the overall reaction rate to that of the rate-determining step and eliminating the concentrations of the intermediates systematically leads to an analytical expression for the rate equation that depends on the concentrations of the measured species, the rate constant of the rate-determining step, and the equilibrium constants of the steps in quasi-equilibrium. In addition to the latter advantage, the rate-determining step approach leads to even fewer kinetic parameters which need to be determined. As the assumptions are stronger, the resulting rate equation is a more limited description of the reaction mechanism. In addition to the fact that the time-dependent dynamics of the full mechanism cannot be described, as for the QSS approximation, the rate equation from this approach is no longer able to account for a shift in the rate-determining step as a function of the reaction conditions.
Besides these three main solution methods, other variants in solution techniques or hypotheses can also be used. One can use two (or more) rate-determining steps to allow the rate equation to shift between rate-determining steps as a function of reaction conditions, an approach which is intermediate between the QSS and the RDS hypotheses. If one wants to further reduce the complexity of the rate equation and hence the number of rate parameters, one can use the irreversible step approximation by imposing that one or more steps in the reaction mechanism are irreversible, resulting in an even less general rate expression. For catalytic reactions, one can choose to neglect a number of adsorbed reaction intermediates by using either the Most Abundant Reaction Intermediate (MARI) approximation, sometimes called the MASI approximation, which considers that the catalytic surface only consists of empty sites and MARI sites, or the Nearly Empty Surface (NES) approximation, by considering that all intermediates are very weakly bound and that the catalytic surface is almost empty. In all these cases, a choice must be made between the generality of the rate equation and the number of rate parameters that need to be provided.

Reaction Rate Constants
As already recognized by Arrhenius, rate processes are characterized by rare events, or, formulated differently, state-changing events (reactions) take place on a long time scale when compared to the dynamic time scales involved in the (vibrational, rotational, torsional, translational) motion of molecules. Reaction rate theories aim to understand how an elementary step actually happens and to predict the rate parameters of a reaction, i.e. the pre-exponential factor and the activation energy, from these microscopic descriptions.
Although several rate theories have been developed to calculate these kinetic parameters theoretically, two main approaches have been developed and improved over the years: the collision theory and the transition state theory.
In the collision theory, an elementary reaction happens when suitable reactants collide with sufficient energy and with the correct orientation and conformation. Based on the kinetic theory of gases and the Maxwell-Boltzmann distribution for the velocities of the molecules, the following reaction rate is obtained for a bimolecular reaction: From this expression, the pre-exponential factor can be isolated and directly calculated on theoretical grounds: The collision theory gives relatively accurate values for pre-exponential factors of reactions between monatomic species (as expected from its basic assumptions), but often fails by several orders of magnitude for reactions between molecules with more complicated structures. The reason for this is that the reacting molecules have been considered to be spherical and hence able to react in all directions. For actual molecules, this is no longer true, as the orientation of the collisions and the conformation of the molecules is not always suitable for reaction.
The most influential rate theory is the transition-state theory. It is based on the evaluation of the barriers on the PES between the transition state, or activated complex, and the reactants. This theory combines quantum mechanics (to evaluate the PES) and statistical mechanics (to calculate the reaction rate). By evaluating the ratio of the partition functions of the transition state in the PES diagram, and the fundamental state of the reactants, the Eyring equation is obtained for the rate constant: The transition-state theory tries to predict the rate constant from the initial reactive flux of molecules going over the activation barrier, and will therefore always overestimate the rate constant. Although Eyring recognized the overestimation and introduced a 'transmission coefficient', it has not been possible to calculate the transmission coefficient within the transition-state theory itself. Hence, the transition-state theory provides a clear upper bound for the rate constant. Even with this limitation, additional difficulties appear, first of all in obtaining a precise knowledge of PES, and secondly in finding the correct configuration of the activated complex. Indeed, the most accurate quantum chemical techniques are computationally very demanding and can therefore only be applied for fixed-geometry energy calculations. Cheaper methods, such as DFT, are needed to map out the PES and to calculate the vibrational frequencies of the transition state. Unfortunately, the DFT functionals are currently not sufficiently reliable to provide accurate rate constants. Hence, the transition state theory, an educated guess for the nature of the activated complex, and modern computational chemistry tools often allow the prediction of rate constants for bimolecular reactions only within an order of magnitude, or at best within a factor of three. Alternatively, Benson (1976) gives a list of rules for estimating values for DS 6 ¼ and DH 6 ¼ that are needed in the Eyring equation on the basis of the characteristics of different types of reactions, for reactions both in the gas phase and in solution. Benson also gives tables of 'group contributions'. For fast estimates of organic reaction rates, Benson's rules are often the method of choice. The transition-state theory is therefore primarily used to understand how a chemical reaction takes place, but has been less successful in its original goal of directly calculating rate constants on theoretical grounds.
Since rate parameters can still not be determined with sufficient accuracy from reaction rate theories, the preexponential factors, activation energies, adsorption constants, etc., are in most applications determined based on experimental data. This is generally done by parameter estimation, also called parameter identification, techniques that determine the rate parameters in the rate equations by minimizing the deviations between the model predictions and the experimental results.
The classic approach starts by carrying out a large set of kinetic experiments. From the analysis of these data, one or several reaction mechanisms are then proposed, and their corresponding rate equations are derived. After estimating the parameters of each of the competing kinetic models, the various models are ranked according to a goodnessof-fit criterion, and the best kinetic model is finally selected. In this approach, since the model selection is performed on the current data set (operating conditions, feedstock compositions, etc.), the selected reaction mechanism is often oversimplified and one should carefully investigate the reliability and robustness of the resulting kinetic model, especially when extrapolating outside the domain of the data set.
More recently, the parameters of the rate equations can be determined on the basis of a microkinetic analysis. Such a microkinetic analysis uses input from surface science, from computational chemistry, and from in situ or in operando kinetic studies to gain insight into the reaction intermediates and the actual elementary steps, and to independently determine the value of some of the rate parameters from separate experiments, thereby reducing the number of rate parameters that need to be estimated from the kinetic experiments. In this way, a more fundamental kinetic model can be obtained that integrates the knowledge of the various research fields.

DESCRIPTION OF THE PROBLEM
In the sections above, the general framework for chemical kinetic modeling has been described. This theoretical background allows the development of kinetic models not only for simple systems, but also for very complex processes. In the latter case, several difficulties arise very rapidly.
Modeling complex processes usually involves several thousand components and reaction intermediates. In the case of petroleum refining processes, the number of molecules and the number of reactions grows almost exponentially with carbon number. Even the enumeration of the isomers quickly becomes impossible without computer algorithms (Tab. 1). To give some orders of magnitude, processes treating naphtha fractions typically deal with up to a thousand components that contain between 5 and 11 carbon atoms, while for processes treating middle distillates, there are already several hundred thousand components that contain typically between 12 and 24 carbon atoms. For vacuum gas oil fractions that typically contain molecules with 25 to 40 carbon atoms, the number of possible isomers exceeds 10 14 , even when hetero-atoms such as sulfur, nitrogen and oxygen are not accounted for. For Vacuum Residues (VR) containing molecules with more than 40 carbon atoms and with high amounts of hetero-atoms, the number of isomers is almost intractable.
Between the components of this extremely large set of molecules, an even larger number of reactions can occur. The reaction network will need to describe the chemistry as precisely as needed for the understanding of the various phenomena, but also to improve the prediction of the process performances and the product qualities. Again, full reaction networks can only be computer-generated.
Associated with these large reaction networks are network reduction techniques. Indeed, for complex feedstocks, the reaction network becomes so large that it becomes very difficult to handle during the simulation. Two main classes of network reduction techniques can be distinguished: a priori lumping techniques and a posteriori lumping techniques. In the former, the full network is not created, but a smaller version of it is proposed, grouping components into 'lumps' and reactions into 'lumped reactions'. In a posteriori lumping techniques, the full reaction network is first generated, before it is reduced into a more tractable network by a scientific approach.
For each reaction of the final reaction network, a rate equation needs to be proposed or generated, to which a number of rate parameters will be associated. For large reaction networks, one not only needs to propose ways to automatically generate the rate equations, but also to reduce the number of rate parameters that need to be identified.
Last but not least, in order to be able to simulate the evolution of the composition in the reactor, one needs to provide an appropriate description of the feed, which can either be a molecular level description of the feed when detailed modeling techniques are used, or a lumped description of the feed.
No matter which modeling approach is selected to simulate complex chemical systems, the four following steps need to be clearly defined first: 1. Feedstock composition: What is the required detail? 2. Reaction network: What reactions need to be considered? 3. Rate equations and rate parameters: At what rate is each component formed? 4. Continuity equations: How is the evolution governed inside the reactor? In this review, detailed reactor models (including vaporliquid equilibrium thermodynamics, heat and mass transfer between phases or inside catalyst particles, external heat exchangers, etc.) and their corresponding continuity equations will not be discussed.
In what follows, systems with several hundred or several thousand components, reaction intermediates, reactions and elementary steps will be considered. In such cases, two main strategies can be applied: a lumping strategy or a detailed modeling strategy. In the first strategy, the chemical complexity is reduced by grouping chemical compounds into families or 'lumps' and grouping individual reactions into lumped reactions between these families. In this case, a mean-field approach together with a rate-determining step approximation is generally used, as will be shown in the examples below. The corresponding rate equations are then written at a macroscopic level for a limited number of pathways between the analytically observable lumps of components, while the corresponding continuity equations are solved by means of a classic determinist solver for a set of ordinary or Partial Differential Equations (PDE). In the detailed modeling strategy, the chemical detail is generally retained at the molecular level, or even at the level of the reaction intermediates. For these cases, we will describe techniques to provide a detailed feedstock composition, to generate and reduce full reaction networks, to derive the corresponding rate equations, and to decrease the number of rate parameters. Concerning the reaction rates, one can either use a mean-field approach and apply the QSS approximation, or calculate the complete solution by means of MC methods, as will be illustrated below.

General Information
The lumping approach consists of regrouping chemical compounds by similar properties called 'lumps'. The lumps are then considered as homogeneous ensembles on which a kinetic model normally used for the molecular compounds can be applied. This approach is often used for processes where the molecular characterization of the reactant mixtures is difficult or impossible because the feedstock is too complex, as is the case in the majority of petroleum refining processes (catalytic reforming, hydrotreating, hydroprocessing, catalytic cracking, thermal coking, etc.).
The development of a lumping approach usually proceeds through the following steps: 1. description of the feedstock by choosing a set of lumps; 2. description of the relationships between the lumps by building a kinetic network of lumped reactions; 3. determination of the rate equations and their associated parameters by optimizing the model on experimental data. The choice of lumps is always a compromise between the capabilities of the analytical techniques to characterize and quantify them on one hand, and the needs of the final user in terms of model prediction and precision on the other. In most cases, the analytical techniques are the limiting step and force the choice of the lumps. In the field of petroleum refining, these analyses can be subdivided into chemical separation techniques (Mass Spectrometry (MS), Liquid Chromatography (LC), Solvent Extraction (SE), etc.) and into physical separation techniques (distillation mainly). Historically, lumped models were developed from the 1960's onwards, at a time when petroleum analyses were very simple but yielded little information, and when computing power was limited. By way of example, the first lumped model for the Fluid Catalytic Cracking (FCC) process  (Cayley, 1875;Read, 1976;Hendrickson and Parks, 1991;Bytautas and Klein, 1998;Faulon et al., 2005) Carbon proposed by Weekman and Nace (1970) contained only 3 lumps corresponding to 3 distillation cuts: 'Feed', 'Gasoline' and 'Gas+Coke'. Over time, thanks to the development of more efficient separation techniques and the increase in computing power, lumped models became more and more complex with a continuous increase in the number of lumps. For the catalytic cracking process, Jacob et al. (1976) proposed an extended 10-lump model in 1976, while in 1994, Pitault et al. (1994) further increased the complexity by introducing an 18-lump kinetic network. More recently, at the end of the 1990s, the lumping approaches were deeply modified with the introduction of 'reconstruction methods', which allow the definition of several hundred to several thousand chemical lumps from a set of limited analyses and expert knowledge. For example, Christensen et al. (1999) proposed a kinetic model for catalytic cracking with approximately 3000 lumps that is based on the Structure Oriented Lumping (SOL) approach developed by Quann and Jaffe (1992). These novel techniques will be detailed in the second part of this article. Figure 4 illustrates the evolution of the lumped kinetic models for catalytic cracking, as proposed by Weekman and Nace (1970), Jacob et al. (1976) and Pitault et al. (1994).

Advantages and Drawbacks of Lumped Kinetic Models
Despite their increasing complexity, lumped kinetic models are relatively easy to develop because the number of lumps and the number of reactions remain limited. Moreover, due to the multi-compound characteristics of the lumps, the reaction pathways are generally global with no intermediate species and the kinetic rate equations are often simple (pseudo-order reactions, Langmuir approach in heterogeneous kinetics, etc.). Their kinetic parameters (preexponential factors, activation energies, adsorption constants, thermochemical constants, etc.) are often determined by minimizing the deviations between model and experimental data coming from pilot units or industrial plants. This simplicity allows the collection of high-speed kinetic models that require limited computing power, a feature that is very interesting for the optimization and control of petroleum processes. This explains why this type of model has been predominant in petroleum refining for over 50 years. However, despite this advantage, lumping methods also have some drawbacks that need to be kept in mind. Firstly, the lumps are often defined as ensembles of compounds with similar physicochemical properties (determined by the analytical techniques) but not necessarily with similar reactivities. If the thermodynamic equilibria inside the lumps are not maintained by fast intra-lump reactions, the properties of the lumps will consequently be modified during the reaction due to the transformation of these internal compounds as a function of their own reactivities. In this case, the base hypothesis that considers that the properties of the lumps are homogeneous and constant during reaction is clearly wrong (Li and Rabitz, 1989;Wei and Kuo, 1969).
The second drawback of lumped models is that they are not associated to a molecular kinetic theory but are directly derived from experimental data coming from pilot units in most cases. In petroleum refining, these experiments are long and expensive. Consequently, even if the theoretical formulation of lumped kinetic models is relatively quick and cheap, their parameter estimation requires more time and money. Indeed, in order for lumped models to be robust and feed independent, a wide variety of experimental data in terms of operating conditions and even more importantly, feedstock composition is needed. Moreover, with the increase in both the complexity of the kinetic models and in the number of lumps, the analyses needed to describe the mixture become more and more important, further increasing the experimental cost for the development of lumped models.
Finally, the last main drawback of lumped kinetic models is related to their relative inability to determine the physicochemical properties of the effluent from its composition and the properties of its lumps. In practice, these models often use correlations in order to obtain an estimation of the desired product properties, but this approach is empirical and very limited. Moreover, because the properties of the lumps can change during the reaction, this estimation method remains somewhat arbitrary.

Examples of Lumped Kinetic Models Developed at IFPEN
Between 1995 and 2005, IFPEN developed lumped kinetic models for several hydroprocessing processes such as Atmospheric Gas Oil (AGO) hydrotreating or fixed-bed residue hydroprocessing. One of the first lumped models developed at IFPEN for the AGO hydrotreating was a 9-lump model (Fig. 5a) accounting for only 1 sulfur lump for dibenzothiophenes (+ H 2 S), 1 nitrogen lump for carbazoles (+ NH 3 ) and 4 hydrocarbon lumps (+ H 2 ) representing saturates, monoaromatics, diaromatics and triaromatics (Bonnardot, 1998;Magné-Drisch, 1995). The analyses required to define these lumps were very simple: elementary sulfur content for the sulfur lump, elementary nitrogen content for the nitrogen lump and MS for the other lumps. The reaction network contained 6 overall reactions. The reactions for aromatic hydrogenation were considered to be reversible, while the reactions for hydrodesulfurization and hydrodenitrogenation were defined as being irreversible. The rate equations were derived from the following hypotheses: existence of two   Evolution of the IFPEN lumped kinetic models for atmospheric gasoil hydrotreating: (a) Bonnardot (1998); (b) López García (2000); (c) López García et al. (2010). types of active sites for hydrogenation and for hydrogenolysis, Quasi Steady State Approximation for the intermediate species, Langmuir-Hinshelwood approach, and equilibrium for adsorption/desorption reactions. The experimental domain was relatively large in terms of H 2 partial pressure (from P base -22 bar to P base +28 bar), H 2 S partial pressure (from 0.5 to 5 bar) and Liquid Hourly Space Velocity (LHSV) (from 1 to 3 Sm 3 /m 3 /h), but only one temperature (T base ) and only one type of feedstock (two full range Light Cycle Oils, denoted LCO, and three LCO aromatic extracts) was used. The kinetic model was implemented in a single phase plug-flow reactor model, and contained 15 parameters, which were determined from a database of 90 experimental data points with 5 independent responses each. In 2000, López García improved the prediction of the model in terms of hydrodesulfurization (López García, 2000;López García et al., 2003) by introducing 3 sulfur lumps which represent the 3 most refractory sulfur classes of the AGO: DBT, 4-DBT and 46-DBT (Fig. 5b). The determination of the 3 sulfur lumps was determined by Gas Chromatography (GC) coupled with a sulfur-specific SCD detector. This extended reaction network contained 9 overall reactions. To reduce the number of rate constants in the rate equations, the author applied the RDS approach for the various reactions on the two types of active sites: hydrogenation sites and hydrogenolysis sites. The experimental domain was extended in terms of operating temperature (from T base À20°C to T base +10°C) and Liquid Hourly Space Velocity (from 0.5 to 4 Sm 3 /m 3 /h). Although three full range LCO and five LCO aromatic extracts were used, the feedstock variability was still limited, as only one type of feed (FCC derived feeds) was used. Further improvements concerned the introduction of thermodynamic constraints for the reversible reaction, and the use of a two-phase plug-flow reactor model based on a Grayson-Streed flash calculation. By introducing the effect of the temperature, the number of parameters was increased to 31, which were determined from a database of 122 experimental data points with 8 independent responses each. To further improve the prediction capabilities of the kinetic model for the AGO hydrotreating process, the direct lumping approach based on the analytical capabilities was abandoned, and a new kinetic model based on a feedstock reconstruction approach (see Sect. 4.1.2) was developed. For this model, the number of lumps were classified by chemical family and by carbon number (López García et al., 2010), resulting in a total to 597 lumps. The kinetic network was also extended by accounting for 14 reaction families related to hydrogenation/dehydrogenation, hydrodesulfurization and hydrodenitrogenation pathways (Fig. 5c). The corresponding rate equations were derived based on the presence of two types of active sites (hydrogenation and hydrogenolysis) using the RDS approach. The experimental domain encompassed a large domain of operating temperatures (from T base À20°C to T base +20°C), of operating pressures (from P base -57 bar to P base +30 bar), and of Liquid Hourly Space Velocities (from 0.3 to 4 Sm 3 /m 3 /h). The largest extension concerned the feedstocks, as 24 different types of industrial gas oils were included: straight run gas oils from a large variety of crude oils, LCO, coker gas oils and their mixtures. This feed diversity is particularly important to confer a high degree of robustness on the model.
Concerning fixed-bed residue hydroprocessing, IFPEN developed a series of lumped kinetic models based on two main separation analyses, the first by volatility (distillation), the second by polarity (Saturates/Aromatics/Resins/ Asphaltenes (SARA) LC) (Haulle, 2002;Le Lannic, 2006). Combining both analyses allowed the definition of 5 petroleum fractions: asphaltenes (ASP), resins (RES), aromatics of the 520°C+ cut (ARO+), aromatics of the 520°CÀ cut (AROÀ), and saturates (SAT). For each of these fractions, an elemental analysis was performed in order to determine its composition in terms of carbon, hydrogen, sulfur, nitrogen, oxygen, nickel and vanadium. The first kinetic scheme described the evolution of the quantity of these different fractions, with an additional specific focus for the sulfur, vanadium and nickel elements (Fig. 6a). The total number of lumps for this approach was equal to 19. The reactor was modeled as a single-phase plug-flow reactor with a Langmuir-Hinshelwood formalism to manage the adsorption of the different species on the catalyst. Subsequently, this model was generalized (Le Lannic, 2006;Verstraete et al., 2007) by defining 2 additional fractions (RES+ and RESÀ) and by decoupling each fraction by chemical element (C, H, O, N, S, Ni, V). This new approach was more rigorous, but the number of lumps was increased to 40 (Fig. 6b). This new kinetic model was coupled with a catalyst morphology model to account for the diffusion of asphaltenes and resins inside the catalyst and for the modification of the catalyst properties (porosity, surface area, etc.) caused by the deposition of nickel sulfides and vanadium sulfides by hydrodemetallization (Toulhoat et al., 2005). A third kinetic and reaction model modified both the lumped kinetic network (Fig. 6c), by introducing additional asphaltenes families, and the reactor model, by introducing diffusion/mass transfer limitations based on the Stephan-Maxwell equations (Ferreira, 2009;Ferreira et al., 2014Ferreira et al., , 2010.

DETAILED KINETIC MODELING STRATEGIES
Due to the complexity of the feedstock, but also because of the limitations in computer hardware and software, classical kinetic models of industrial processes, such as petroleum refining or petrochemical processes, have traditionally been based on a lumping approach, as illustrated above. However, over the last decades, these industries have been subjected to more stringent environmental legislation and tighter product quality constraints, which are phrased in terms of molecular or atomic composition of feedstocks and process products (Neurock, 1992). Novel kinetic models must therefore be  Evolution of the IFPEN lumped kinetic models for fixed-bed residue hydroprocessing: (a) Haulle (2002); (b) Le Lannic (2006); (c) Ferreira (2009). able to predict the performance of processes at the molecular level. This cannot be ensured by a lumping approach due to its limitations in describing the molecular composition throughout the entire reactor.
The limitations of lumped kinetic models therefore motivated the development of more fundamental and more detailed kinetic models. Two other approaches can be distinguished to model the complex process kinetics: a mechanistic approach and a molecular approach. The difference between these approaches resides in the level of detail at which the reaction pathways are described.
Mechanistic models are the more fundamental approach based on a detailed explicit description of the reaction pathways, including its reactants, elementary steps (such as homolytic bond scission, radical recombination, olefin protonation, ion deprotonation, b-scission, isomerization, hydride abstraction, etc.), and its reaction intermediates (such as ions and radicals). In this approach, a limited number of a priori assumptions are needed and the rate parameters are more fundamental in nature.
Molecular models result from an intermediate approach between the mechanistic and lumping approaches. Here, a chemical reaction system is modeled at the molecular level without including its reaction intermediates. The reactions are viewed as molecule-to-molecule transitions and each reaction is characterized by an overall rate constant. The effects of reaction intermediates are included in the rate equations by imposing assumptions during their derivation.
Both detailed kinetic approaches enable one to overcome the drawbacks related to the lumping approach, since they both retain a molecular description of the reaction system throughout the entire reactor simulation, which allows the creation of feedstock-independent models. However, such models expect a molecular description of the feedstocks, of the reaction pathways, and of the rate equations and rate parameters. In what follows, methods to determine a molecular-level description of the feedstock are first discussed in detail and illustrated. Subsequently, the simulation of detailed reaction networks is discussed by simultaneously treating the generation and reduction of the detailed reaction networks, and the rate equations, rate parameters and reactor simulation, before illustrating examples of various applications.

Molecular Description of the Feedstocks
The first step in the simulation of a detailed molecule-based kinetic model is to determine the molecular-level description of the feedstock. This step is a crucial component of the molecule-based kinetic modeling since the feedstock description provides the major part of the 'initial conditions' of a molecular reaction model. The molecular description can be obtained either by using, when possible, advanced analytical characterization techniques, or by numerically creating a molecular representation of the feedstock through composition modeling algorithms, here called molecular reconstruction methods.
In this section, the existing methods of obtaining the molecular description of the feedstock will be reviewed.

Analytical Characterization
Common analytical techniques used to obtain a molecular characterization of the feedstocks are MS and GC. MS consists of transforming molecules, in the gaseous state, into ion fragmentation patterns by electron bombardment. These fragments are then separated and detected by their massto-charge ratio (m/e). The graphical representation of the ion intensity as a function of m/e makes up the mass spectrogram which is unique for each molecule (Wauquier, 1994). The MS technique is suitable for quantifying an individual targeted compound or a small number of compounds (20-30 molecules) in a mixture. However, the main application of the MS technique, in particular in petroleum laboratories, is a quantitative analysis of the chemical families of the form C n H 2n+Z X where C and H are carbon and hydrogen, respectively, n is the number of carbon atoms, Z is the hydrogen deficiency and X refers to heteroatoms such as S, N, O (ASTM D2425, 2007ASTM D2786, 2007Fafet et al., 1999a,b;Fischer and Fischer, 1974).
The MS and GC techniques can identify and quantify several hundred molecules in a mixture. However, the number of hydrocarbon isomers increases exponentially with the number of carbon atoms. For example, the number of possible isomers for a molecule with 5 carbon atoms (C 5 ) is 8, with 10 carbon atoms (C 10 ) it is 474, and with 15 carbon atoms (C 15 ) it is over 40 000 (Hudebine, 2003). This is why, beyond C 10 , the analytical techniques are not able to obtain a direct identification of all molecular species.
Other advanced analytical methods, such as comprehensive two dimensional Gas Chromatography (GC 9 GC) (Bertoncini et al., 2013;Dutriez et al., 2009Dutriez et al., , 2010Dutriez et al., , 2011Giddings, 1987;Mahe et al., 2011;Mondello et al., 2002), High-Performance Liquid Chromatography (HPLC) , and Fourier Transform -Ion Cyclotron Resonance -Mass Spectrometry (FT-ICR-MS) (Qian et al., 2001), have been developed to provide a molecular description of complex feedstocks. Today, these advanced techniques do not provide quantitative molecular information about the distribution of compounds within a complex mixture, especially for the high carbon number range, but rather structural characteristics of the molecules. Moreover, they are generally very time-consuming, and the obtained data are often difficult to interpret.

Molecular Reconstruction Methods
In an effort to overcome the limitations of the analytical characterization, current research focuses on the development of composition modeling techniques, also called molecular reconstruction methods. The composition modeling consists of transforming available analytical information, such as the overall properties and molecular structural information, into a molecular representation of the feedstock.

Approach by Model Molecule
The first reconstruction methods consisted of generating a model molecule to represent the petroleum cuts. The objective of these methods is to generate a single molecule that contains the average characteristics of mixture to be represented.
The first reconstruction method proposed in the literature concerned coal constituents, and was developed by van Krevelen (1952). It consisted of creating a 2D molecule for a given fraction from its carbon content, hydrogen content, refractive index, density, and aromatic carbon content. Williams developed a molecule reconstruction algorithm for the aromatic fraction of an oil sample using molecular weight, elemental analysis, and 1 H NMR (Williams, 1958). The algorithm was also applied to four asphalt fractions samples, to a virgin gas oil sample and to a catalytic cycle stock sample. Brown and Ladner (1960) developed a similar approach for coal-like material. One of the most well-known methods was devised by Speight (1970) and creates a 2D molecule for an asphaltene fraction from its elemental analysis, average molecular weight and proton nuclear magnetic resonance ( 1 H NMR). From this analytical information and some assumptions, Speight developed 7 mass balances for different types of carbon atoms in order to determine the average hydrocarbon structure of the petroleum cut. Almost simultaneously, Hirsch and Altgelt (1970) proposed a similar technique that also used the average density as an analytical input. The authors also suggest to replace some assumptions with 5 parameters that had to be estimated by the modeler.
From the analytical information, the modeler can easily build a representative model molecule of a mixture. Additionally, most of these methods only need the overall properties which can be obtained without too much experimental effort. However, a model molecule is far from being representative of all molecules in a mixture. Indeed, a model molecule does not provide the effects of polydispersity of complex mixture in terms of physical properties and reactivities, which are essential for the process modeling.

Deterministic Molecular Reconstruction Approach
The deterministic molecular reconstruction approach consists of modeling the mixture composition by means of a set of molecules whose mixture properties are close to those of the mixture to be represented. The synthetic mixture of molecules is generated by estimating the mole fractions of the predefined database of compounds. The development of such an approach is usually done in the following manner: 1. creation of a predefined database of compounds, 2. definition of the objective function (analytical constraints), 3. modification of the mole fractions of the compounds.
The creation of a database of compounds is extremely important for the accuracy of the molecular reconstruction method, because this step chooses which compounds will be considered. Evidently, the absence of the key compounds in the database leads to an inaccurate molecular representation of the mixture. Depending on the type of mixture that is considered, a different molecular library has to be selected. The database is therefore created from qualitative analytical information and expert knowledge. As mentioned in Section 4.1.1, advanced analytical techniques can provide molecular-level information. This information can be a qualitative identification of the molecules for light mixtures, such as gasoline, or the identification of the molecular structures for more complex feedstocks, such as middle distillates or vacuum gas oils. Thus, the predefined database can be defined as a set of molecules, as proposed by Allen and Liguras (1991) or as a library of molecular structures, such as the SOL (Structure Oriented Lumping) approach proposed by Quann and Jaffe (1992) or the MTHS (Molecular Type and Homologous Series) method proposed by Peng (1999) and Zhang (1999).
Whatever the predefined database, the pure component properties of its compounds are required. For light compounds, pure component properties are often available in thermophysical properties databases, such as the NIST Chemistry WebBook (Linstrom and Mallard, 2015), TRC (Frenkel et al., 2000) or the DIPPR (Daubert and Danner, 1989). If the pure component properties are not available, they can be calculated either by inspection of the structure of the molecule, to assess chemical formula and molecular weight, by correlations, or by group contribution methods, which are frequently used to estimate normal boiling points, densities, and critical properties (Benson, 1976;Benson and Cohen, 1993;Constantinou and Gani, 1994;de Oliveira et al., 2013a;Joback and Reid, 1987;Marrero-Morejón and Pardillo-Fontdevila, 1999).
To calculate the various properties of a mixture of molecules from its composition, mixing rules need to be used that are based on the mole fractions of the compounds and their pure component properties. It is common practice to assume that the mixing rules are linear with respect to either the pure component property or its blending index , even if some of the properties cannot be blended linearly. For every piece of analytical information, an analytical constraint can be written. This set of analytical constraints is also associated to a number of physical constraints: a material balance to ensure that the sum of mole fractions is equal to 1, and mole fraction constraints that require each mole fraction to be positive and to fall within the range from 0 to 1.
Once the database of compounds is defined and the analytical and physical constraints have been set up, the mole fraction of each compound needs to be determined. This is generally done by minimizing the difference between the available quantitative analytical data and the properties calculated from the composition of the synthetic mixture. Even using linear mixing rules, the estimation of the mole fractions is often an ill-defined problem because the number of unknowns (mole fractions) largely exceeds the number of observables (analytical data). To reduce the number of unknowns, several techniques have been proposed in the literature. Allen and Liguras (1991) apply a constrained Simplex method to minimize a weighted criterion using the enthalpies of formation of the molecules as weights, yet respecting analytical constraints. Jaffe et al. (2005) suggest the minimization of a specific criterion using a Lagrange multiplier method. Zhang and co-workers (Ahmad et al., 2011;Aye and Zhang, 2005;Peng, 1999;Wu and Zhang, 2010;Zhang, 1999) developed an approach to estimate the mole fractions of a mixture from their overall properties by interpolating into a database of well-characterized mixtures. Recently, Pyl and coworkers (Pyl et al., 2011) use the parameterized probability functions to describe the distribution of mole fractions in the compounds. The deterministic molecular reconstruction approach is the most efficient of the reconstruction methods. Its conception is based on a large set of analytical data and few assumptions are needed concerning the presence or absence of a compound in a mixture. Moreover, this method is usually quite fast in determining a molecular composition of a mixture. However, this approach suffers from two shortcomings. Firstly, the development of the predefined databases is a very time-consuming and expensive task because of the advanced analytical tools that are needed to obtain the molecular information. Of course, once these databases are created, the determination of the molecular composition of new mixtures is fast, but the initial analytical work remains an extremely limiting factor. Secondly, this approach cannot be applied to the very complex mixtures for which no analytical technique is currently powerful enough to identify the chemical structure of the most abundant chemical species contained in these mixtures. This is typically the case for petroleum residues or biomass fractions.

Stochastic Molecular Reconstruction Approach
The stochastic molecular reconstruction approach models the molecular composition of a mixture from a set of Probability Distribution Functions (PDF) for molecular structural attributes (number of rings, chains length, number of chains, etc.). The governing principle of this approach is that any molecule can be considered as an assembly of molecular attributes. Each attribute is represented by a PDF which provides the probability of finding this attribute in a molecule of the mixture Trauth et al., 1994).
The concept of using PDF to describe complex mixtures has initially developed in the field of polymers for characterizing the molecular size distribution of polymers (Flory, 1936;Libanati, 1992). The application of the PDF was extended to the petroleum field by Klein's group (Campbell and Klein, 1997;Neurock et al., 1994;Trauth et al., 1994) based on detailed analysis of heavy fractions done by Boduszynski (Boduszynski, 1988(Boduszynski, , 1987. Boduszynski showed that the structural properties of a petroleum fraction follow statistical distributions when plotted against the Atmospheric Equivalent Boiling Point (AEBP), the molecular weight or the number of carbon atoms. Therefore, Neurock et al. (1994) suggested the characterization of petroleum fractions by means of a set of PDF for molecular attributes. They developed a Stochastic Reconstruction (SR) method to generate an equimolar set of molecules from a set of PDF for molecular attributes which are characteristic of the mixture to be represented. In practice, each PDF is sampled via a MC procedure to identify the structural attributes of a molecule, as illustrated in Figure 7. The sequence of the PDF sampling steps is oriented by a building diagram, which is defined by the modeler. The selected attributes are randomly assembled to obtain the structure of the molecule. MC sampling of the set of PDF can be repeated N times so as to obtain a mixture of N molecules.
The initial SR method imposes as an input a set of PDF parameters which are representative of the mixture to be reconstructed. However, the PDF parameters are either known from prior work, or obtained from detailed analytical methods, which are very time-consuming. Therefore, Klein's group  proposed to include an optimization loop to estimate the PDF parameters from the overall properties as in Figure 8. To do this, each PDF is described by a parameterized mathematical law (Gaussian distribution, gamma distribution, exponential distribution, etc.). For a set of PDF parameters, a mixture of molecules is generated, and their overall properties are calculated. The calculated properties are then compared to the experimental data by means of an objective function . The objective function is then minimized by modifying the PDF parameters. The optimization is performed using a global optimization method, such as simulated annealing, genetic algorithms, or particle swarm optimization.
The SR method is an ingenious technique to create a molecular representation of the complex mixture from the overall properties. To overcome the lack of molecular information, this approach supposes certain assumptions of similarity or uses high internal constraints that cannot be provided by the overall properties. However, the SR method also has disadvantages. The selection of the molecular attributes, the PDF and the building diagram are no trivial tasks to be performed by the modeler. The same mixture represented by different sets of molecular attributes and/or PDF may lead to different solutions. Secondly, this method is computationally very demanding because the optimization loop needs to generate a large number of molecules before reaching an accurate molecular representation Van Geem et al., 2007;Wu and Zhang, 2010). Finally, SR is not always capable a fitting all mixture properties simultaneously due to the numerous constraints imposed by the PDF and building diagram .

Examples of Molecular Reconstruction Methods
Developed at IFPEN Given the limitations of lumped models, IFPEN has developed the molecular-based kinetic models that prompted the development of molecular reconstruction methods. Reconstruction by Entropy Maximization (REM) (Hudebine andVerstraete, 2011, 2004;Hudebine, 2003) was the first method developed at IFPEN. The REM method is a deterministic molecular reconstruction approach that is based on the principle of maximum Shannon entropy, according to which an information entropy criterion must be maximized in order to obtain the optimal result (Shannon, 1948). The mole fractions of the predefined set of molecules are adjusted by maximizing the Shannon entropy criterion (E), given by Equation (9), subject to the mass balance (Eq. 10) and the analytical constraints (Eq. 11). The maximization of the criterion is performed by combining the Lagrange multiplier method and the conjugate gradient optimization method. This method was successfully applied to LCO gas oils (Hudebine, 2003), FCC gasolines  and steam cracking naphthas .
and Illustration of the SR method for creating an asphaltene molecule .
where E represents Shannon's information entropy, N is the number of molecules present in the predefined database of molecules, x j represents the mole fraction of molecule j, x is the vector of mole fractions x j , P k is the value of analytical property k, F k is the mixing rule for analytical property k, and K is the number of analytical properties or constraints. In order to extend the REM method to more complex mixtures, IFPEN developed a two-step reconstruction (SR-REM) method in which REM has been coupled to a SR method Hudebine, 2003). The SR method is first applied to generate an equimolar database of molecules that are typical for a given mixture. The REM method is then used to improve the mixture properties of the previously generated set of molecules by modifying their molar fractions. The SR-REM method overcomes the drawbacks of both reconstruction methods. The SR method is a very constrained method that creates an equimolar mixture, by generating molecules that are always typical of the studied mixture due to the use of specific molecular attributes and building rules based on analytical information and expert knowledge, but at final convergence, it only approaches the analytical data. In contrast, the REM method is a fast technique to obtain a mixture whose properties closely match the analytical data, but it needs a particularly well-defined initial set of molecules .
Two different SR-REM algorithms were developed at IFPEN: a direct algorithm and an indirect SR-REM algorithm (de Oliveira, 2013;de Oliveira et al., 2013a,c;Hudebine and Verstraete, 2004). The direct SR-REM algorithm (Fig. 9a) consists of applying the SR and REM methods as two consecutives steps to reconstruct a mixture.
Thus, an equimolar database of molecules is generated for each studied mixture which is then submitted to REM method to improve the mixture properties. However, as SR step of this reconstruction method uses a MC approach to generate its initial set of molecules, the direct SR-REM still remains a computationally-demanding algorithm. The indirect SR-REM algorithm (Fig. 9b) was therefore developed and may be applied when a large number of mixtures of a given type need to be represented. The idea behind this algorithm is to generate a large reference database of representative molecules for a type of mixture once and for all. Subsequently, only the REM step is employed, requiring only a few seconds of CPU time. For each mixture, the REM step starts from this reference database to obtain a new set of molecules whose properties are very close to those of the mixture to be represented. All synthetic mixtures will now contain the same molecules as the reference database, but with modified mole fractions.
The SR-REM method was developed for and validated on LCO gas oils de Oliveira et al., 2012;López Abelairas et al., 2016) and was later extended to vacuum gas oils (Charon- Revellin et al., 2011;Verstraete et al., 2004) Figure 8 Flow diagram of the SR method with the optimization loop. .
As mentioned in Section 3.3, IFPEN developed a feedstock reconstruction method, called statistical reconstruction, in order to improve the AGO hydrotreating kinetic model. Statistical reconstruction López García et al., 2010) is a deterministic molecular reconstruction approach that provides a matrix of structural lumps or pseudo-compounds to represent the composition of middle distillates. The rows of the matrix contain 28 different chemical families, such as paraffins, naphthenes, aromatics, sulfur compounds, while the columns represent the molecule size expressed by carbon atoms (from 1 to 30). Each matrix element contains the mole fraction of the pseudo-compounds which are estimated from four chemical analyses: MS, sulfur speciation, nitrogen speciation and simulated distillation. The first three analyses provide the overall mole fraction of the 28 chemical families, which are then distributed among the pseudo-compounds based on the simulated distillation data by using two gamma distributions. These statistical distributions represent the variation of the size of molecules within each family by means of an increase in alkyl chain length. The first distribution defines the chain length for acyclic structures (paraffins, mercaptans, amines) and the second distribution describes the length of the side chain grafted on the core of the cyclic compounds. For each chemical family, the distribution is adapted with respect to its minimum and maximum number of carbon atoms and to the initial and final boiling point of the simulated distillation.

Network Generation and Network Reduction
Generation of reactions and reaction networks fall under the larger concept of 'computer-assisted chemistry'. This vast field of 'computer-assisted chemistry' comprises a large variety of applications such as kinetics packages, network generation programs, tools for designing experiments, property estimation methods, structure elucidation software, structure visualization applications, computational chemistry, quantum chemistry, force field methods, computerassisted organic synthesis planners, and many more.
This illustrates that automated network generation techniques are well suited for any type of reaction, as long as one can define the transformations that are typical of a particular system. Most of these algorithms generate an exhaustive list of the vast number of reactions, thus avoiding error-prone manual reaction network developments. Good overviews of some of these automated network generation methods for different applications can be found in Ugi et al. (1993), Prickett and Mavrovouniotis, (1997b), Tomlin et al. (1997) and Klein et al. (2006). Ugi et al. (1993) tried to classify network generation algorithms and identified three types: empirical methods, semi-formal methods, formal methods.
According to their classification, empirical methods are based on pre-established reaction libraries and expert systems. Examples of such databases were already underway in the 1960s (Vleduts, 1963) and continue to be developed (Ihlenfeldt and Gasteiger, 1996). Semi-formal techniques are based on heuristic algorithms that generate reactions, either overall reactions or elementary steps, in a network from a few basic reaction types, but do not have selection procedures to remove improbable reactions from the set of conceivable solutions. Formal techniques generate the various elementary steps at the mechanistic level by taking into account the physical and chemical description of the molecule, and select the most probable elementary steps to be retained in the final network from all possible steps.
Once the reaction network is available, it can be used for simulation purposes. For the complex feedstocks, the reaction network may become very large and very difficult to manage along the simulation, however. In such a case, a direct simulation may not be possible without some reduction of the network. Reduction of a reaction network is not so easy. Complex mathematical tools are required that generally use the full knowledge of the reaction network, including the reaction rates in order to classify the reactions according to different scales of reactivity: 'fast' reactions and 'slow' reactions. During the development of the model, some qualitative information or orders of magnitude may be available, but quantitative information on the kinetics is rarely known. Hence, a priori reduction of a reaction network is usually based on empirical rules, such as limiting the maximum molecule size, excluding molecule types, ignoring reaction families, constraining the total number of species and/or reactions, selecting reactions on their reaction (free) enthalpy or on their presumed reaction rate, etc. A bad choice of these 'reduction rules' can easily lead to biased reaction networks, resulting in incorrect predictions and behaviors during simulation.
Rigorous reduction techniques aim to retain as much as possible the detail of the original reaction network. When the kinetic equations between species are known and sets of species are in thermochemical equilibrium, one can group the species in equilibrium into lumps, and write a new reduced kinetic network between lumps that is exactly equivalent to the original network (Wei and Kuo, 1969;Li and Rabitz, 1989;Vynckier and Froment, 1991;Cochegrue et al., 2011). In this sense, this is strictly an exact lumping technique as defined by Wei and Kuo (1969), since the composition of each lump is known through the equilibrium relations. Hence, when experimental data shows that the species of a lump are in thermochemical equilibrium, they proposed an a posteriori lumping scheme, also known as a late lumping scheme, which groups species at thermodynamic equilibrium inside a single lump. It is therefore a rigorous network reduction technique that allows the production of a simplified network, which can then be used in reactor simulations (Vynckier and Froment, 1991). Indeed, using such a rigorous network reduction method avoids the problems related to the diversity of feedstocks, unlike a priori relumping of species, also known as early lumping, which is generally a 'blind' lumping approach devised to cope with the lack of component-based analytical data and where empirical kinetics do not account for the composition of the lumps. Reduction techniques may also use stochastic network generation (Faulon and Sault, 2001;Mizan et al., 1998;Shahrouzi et al., 2008), or seeding and deseeding techniques Joshi et al., 1999;Watson et al., 1996).
Reaction network reduction techniques can be divided into five categories: global reduction, response modeling, chemical lumping, statistical lumping, detailed reduction (Frenklach, 1987).
Global reduction techniques transform a complete reaction scheme into a small number of overall reaction steps. These techniques comprise ad-hoc methods such as empirical fitting, reduction by approximations, and lumping. Global reduction techniques are problem-specific and cannot be generalized. Response modeling techniques consist of mapping model responses (species concentrations) and model variables (initial boundary conditions, rate parameters, transport properties) through functional relationships, typically combinations of simple algebraic functions such as polynomials, by using either experimental data or computer experiments. As for global reduction techniques, response modeling solutions are problem-specific since they require data to construct the algebraic functions. The chemical lumping method is guided by similarity in the chemical structure or chemical reactivity of the reacting species, i.e. the reactions in the entire network have essentially the same rate parameters, as is the case in polymerization reactions. The statistical lumping technique is used when growth processes can be described by the Smoluchowski equation. Finally, the detailed reduction technique consists of identifying and removing non-contributing reactions. An effective reduction strategy is to compare the individual reaction rates with the rate of a chosen reference reaction, which can be a given rate-limiting step or the fastest reaction. Consequently, the detailed reaction reduction approach is a general technique that is applicable to any reacting system.

Rate Equations and Rate Parameters
The rate equations for mechanistic kinetic models are straightforward. Indeed, due to its fundamental approach that explicitly describes the elementary steps, the rate of each reaction follows Guldberg and Waage's law of mass action. In molecular kinetic models, the reactions are viewed as molecule-to-molecule transitions, without accounting for the detailed reaction mechanism and its reaction intermediates. Hence, the rate equations may need to include the indirect effects of the reaction intermediates by using power-law type rate equations or Langmuir-Hinshelwood type rate equations. This type of rate equation can be derived using the above described QSS approximation, RDS approach, or one of their variants.
The number of rate parameters in a large reaction network will be huge. It is therefore almost impossible to obtain a value for each rate parameter through experimental work or through theoretical calculations. Hence, in almost all detailed kinetic models, rate parameters are typically defined for homologous series of reactions. The idea is that the reaction rate is determined by the transformation that happens at the 'reacting center' of the molecule, i.e. by the groups that are directly involved, while the rest of the molecule is only a spectator. This will be considered to be particularly true in large molecules. It should be mentioned, however, that this remains an important simplification of the reality, since actual molecules are continuously vibrating, bending and rotating in such ways that the 'non-reacting' moieties may still influence the reactivity.
For homogeneous acid-catalyzed reactions, Brønsted (Brønsted and Pedersen, 1924;Brønsted, 1928) observed that strongly exothermic reactions have a low activation energy and derived from his experimental work that the catalytic activity is directly related to the ionization constant K a . The Brønsted equation can be written as: but is generally given in its linearized form: The Brønsted equation can also be written differently. Introducing into this equation the van 't Hoff relation for the ionization equilibrium constant and the Eyring equation for the rate constant leads to: After rearranging, one obtains: or, This form of the Brønsted equation shows that the Gibbs free energy of activation for a reaction DG 6 ¼ is directly related to its Gibbs free energy of reaction DG a , i.e. the Gibbs free energy released by the reaction. The Brønsted equation is therefore historically the first example of a 'linear Gibbs energy relation' (the IUPAC recommended term), more commonly known as 'Linear Free-Energy Relationships' (LFER).
One of the most famous LFER is the Hammett equation used in organic chemistry. Hammett (1937) found that, for a large set of reactions involving benzoic acid derivatives with meta-and para-substituents, the ratios of the rate constants were related to the ratios of the ionization equilibrium constants: The Hammett equation is again generally written in a linearized form: or, where k is the reaction rate constant of a substituted reactant, k 0 is the reaction rate constant of the unsubstituted reactant, r H is the substitution constant, which represents the difference of the Gibbs free energies of ionization and accounts for field, inductive, and resonance effects, and q H is the sensitivity constant, which only depends on the type of reaction, not on the type of substituent. The original Hammett equation was later refined, leading to the Taft equation, the Swain-Lupton equation, the Grunwald-Winstein equation, the Yukawa-Tsuno equation, etc.
The origin of such linear free-energy relationships was first derived by Evans and Polanyi (1935, 1936, 1938, who approximated the PES by straight lines. This approximation leads to a theoretical justification of a LFER, a linear relation between the activation free energy DG 6 ¼ and the Gibbs free energy of the reaction DG R : where b 1 and b 2 depend on the slopes of the lines and the initial distance between reactants and products. This relation can be intuitively understood for endergonic elementary reactions, since the free energy barrier for activation has to increase when the free enthalpy of reaction increases. Within a series of closely related atom-transfer reactions, Evans and Polanyi (1938) also showed that the activation energies depend linearly on the reaction enthalpies. This 'Linear Activation Energy-Reaction Energy Relationship' or linear energy relationship is generally referred to as the Evans-Polanyi relationship and has the following form: Evans and Polanyi applied this relationship to organic reactions, but the Evans-Polanyi relationships were later shown to apply to many different types of atom transfer reactions. Semenov (1954Semenov ( , 1958) studied a large number of atom and radical reactions, extended the Evans-Polanyi relationships to chain reactions, and suggested a value of 48 kJ.mol À1 for E 0 , while a = 0.25 for exothermic reactions and a = 0.75 for endothermic reactions. Temkin was the first to apply the Evans-Polanyi relationships to heterogeneous catalysis, together with Mochida and Yoneda (1967a-c). Marcus (1968) approximated the PES by two intersecting parabolas whose minima correspond to the free energy levels of the reactants and of the products respectively. The free enthalpy barrier now corresponds to the point of intersection of the two parabolas, and its values is given by: where k is called the 'reorganization energy' and depends on the curvature of the parabolas and the distance between the energy minima of reactants and products. The group k/4, or ÁG 6 ¼ int , is called the 'intrinsic barrier', i.e. the free energy barrier for a reaction whose free energy of reaction equals zero. Marcus initially developed this equation to correlate solution reaction rates for electron transfer, but this relation has been proven useful for atom, proton, hydride, and group transfers as well. The above Marcus equation is sometimes referred to as a Quadratic Free-Energy Relationship (QFER). When the reorganization energy k is much larger than the Gibbs free energy of reaction, the curvature becomes negligible and the Marcus equation reduces to a LFER.
Numerous attempts have been made to extend such linear free-energy relationships or Evans-Polanyi relationships. In many cases, the developed relationships are arrived at by means of a statistical approach that relates the rate constant k or the activation energy E a to various physicochemical properties of the reactants and/or products. Furthermore, many of these empirical or semi-empirical equations no longer correlate the activation free energy DG 6 ¼ or activation energy E a to other free-energy terms or energy terms. Hence, one should no longer refer to such relations as linear free-energy relationships or linear energy relationships, but rather use the terms Quantitative Structure-Activity Relationships (QSAR), Quantitative Structure-Reactivity Correlations (QSRC) or simply the term Correlation Analysis. The basic premise behind these QSAR equations is that, for a homologous series of reactions, the rate constants can be correlated to some molecular property. The simplest form of a QSAR is the simple linear relationship: where k i is the rate constant of molecule i for a given reaction, h i is the reactivity index of molecule i, and a and b are the correlation coefficients for a given reaction family.
In many cases, a multilinear QSAR is developed for each reaction family: where k i is the rate constant of molecule i for a given reaction, h ji is the reactivity index j for molecule i, and the J+1 parameters b j are the correlation coefficients for a given reaction family. Many other forms of QSAR equations (quadratic, etc.) can be found, and some theoretical basis is given in Wold and Sjöström (1978).
In the literature, many examples of QSAR equations can be found, especially in pharmaceutical research, toxicology, and biological sciences, either with or without a theoretical basis. To illustrate the various levels of theoretical background, some examples of QSAR equations for applications in the refining industry are given here. Neurock (1992) derived a linear QSAR that correlated the rate constant for the dealkylation of alkyl aromatics to the heat of reaction, and a linear QSAR that correlated the rate constant for aromatics hydrogenation to the p-electron density. Korre (1995) developed a multi-linear QSAR for the rate constant for aromatics hydrogenation to the heat of reaction, the number of hydrogen atoms that were incorporated, the number of aromatic rings, and the number of naphthenic rings. Froment et al. (1994) developed a type of a group-contribution method for the rate constants for the hydrodesulfurization of all benzothiophene and dibenzothiophene isomers, by accounting for the steric and electronic effects of the various alkyl substituents and their positions around the sulfur atom. Liguras and Allen (1989) correlated the overall rate constant for catalytic cracking of paraffins to the total number of carbon atoms, and the number of carbon atoms of each type (primary, secondary, tertiary atoms). López García et al. (2010) used, depending on the reaction family, a value for the reaction rate that was either constant or a function of the carbon number. For the hydrocracking of vacuum gas oils, Stangeland (1974) proposed a non-linear correlation of the rate constant with the boiling point temperature of the fraction or the component.
In conclusion, LFER and QSAR are empirical correlations observed between rate parameters and equilibrium constants, rate coefficients and/or structural parameters, usually among a series of reactions within a given mechanism (Connors, 1990). They are an empirical but quantitative way of 'systematizing' the similarity of reactions in a well-defined domain of applicability. They imply that the steric and entropy terms remain almost constant through a homologous series of reactions. These relations are also referred to as 'extrathermodynamic relationships', since nothing makes these correlations necessary from a theoretical point of view (Chapman and Shorter, 1978). Despite this, they are relatively common, due to the fact that the structure of a molecule outside of the 'reacting center' has only a limited influence on the reaction rate, and are very useful in chemistry and in chemical engineering applications.

Reaction Network Simulation
Inserting a detailed kinetic model into a steady state onedimensional reactor model generally yields a set of coupled Ordinary Differential Equations (ODE) (for ideal plug flow reactors or batch reactors) or a set of coupled algebraic equations (for ideal continuous stirred tank reactors). When the detailed kinetic model is inserted into transient and/or more detailed reactor models, coupled PDE and/or Algebro-Differential Equations (ADE) arise. Such a set of equations can then be solved using classic numerical tools. This meanfield approach is the classic way to perform reactor simulations.
In some cases, it is not possible to arrive at a complete description of the reaction network. In other cases, the mean-field approximation breaks down, as can happen for surface reactions with energetic heterogeneities on the surface, surface diffusion, and/or attractive-repulsive interactions between adsorbed species. In such cases, a change in point of view is needed. One alternative way to simulate large systems is to change the basic description of the problems. Instead of following a macroscopic description of the mixture (concentrations, pressures, etc.), the evolution of a population of molecules and their transformations are described. By analogy to fluid mechanics, this is similar to the distinction between an Eulerian description (behavior of properties function pressure, specific gravity, velocity, etc.) and a Lagrangian description (individual motion and properties of a molecule).
An important class of models involves the description of random phenomena. The probabilistic character of chemical reactions has historically been used for the description of reacting ideal gases using the notion of colliding molecules and energy levels to determine whether a reaction will occur or not. In these applications, this description was not used for simulation purposes, but for the description of the microscopic events that lead to reaction. The theoretical background of the probabilistic description of reaction events is based on the Chemical Master Equation (CME). This equation provides the time-evolution of a system of countable states where switching between states is treated probabilistically. From a probabilistic point of view, the following set of differential equations is obtained: This equation is difficult to solve and has no analytical solution, except for some very simple academic cases. The development of computers and appropriate algorithms allowed the numerical simulation of those reaction events. This approach was formalized by Gillespie (1976), resulting in an algorithm (Stochastic Simulation Algorithm -SSA) that is as simple as the theoretical description (CME) is complex. Gillespie (1976) introduce the notion of a propensity function that represents the probability that a reaction occurs. By rewriting the equations, he proved that the evolution between two random events can be decoupled by the random generation of two numbers, the first giving the time of the next reaction, and the second choosing the reaction that will occur according to the propensity function. This algorithm is classically called the SSA and is shown in Figure 10.
Each realization of the algorithm gives a particular simulation. The average of a large number of reactions converges to the deterministic simulation of a mixture with the same macroscopic properties (Fig. 11). The size of the initial population has an important effect on the speed of the convergence (Shahrouzi, 2010;de Oliveira et al., 2013a).
With an a priori determined reaction network, there are no advantages of using the SSA to integrate the continuity equations. For very large reaction networks, however, the memory requirements for the SSA are generally lower than classic numerical integration packages.

Examples of Complex Reaction Network Simulations
Developed at IFPEN

Classic Single-Event Kinetic Modeling
The single-event kinetic modeling methodology has been developed at the Ghent University by Froment and coworkers. The 'single event' concept links the reactivity of a molecule to its geometry and to a limited number of intrinsic kinetic parameters, which depend only on the type of molecule and the type of reaction. Using the concept of 'single events' is an efficient way of modeling reactions without introducing reductive assumptions about the composition of the feed .
In the single-event methodology, the level of detail is very high, since the transformations on the acid sites concern the elementary steps between carbenium ions. The reactions considered depend on the process. For bifunctional catalysts (isomerization, catalytic reforming, hydrocracking), the metal phase reactions concern the hydrogenation and dehydrogenation steps, while the acid phase reactions concern the protonation of olefins and the deprotonation, Hydride Shift (HS), Methyl Shift (MS), ethyl shift, alkyl shift, intra-ring alkyl shift, Protonated CycloPropane (PCP) isomerization, Protonated CycloButane (PCB) isomerization, and b-scission of carbenium ions. For catalytic cracking, the same acid phase reactions are supplemented with protolytic cracking reactions of paraffins (Fierro et al., 2001(Fierro et al., , 2002 and hydride transfer between olefins and carbenium ions. The network generation is based on a semi-formal network generation algorithm in which reaction rules describe the reaction pathway occurring on the catalyst as a function of the reactant and its configuration. These reaction rules are iteratively used by an algorithm that handles a numerical description of molecules and of reactions and generates in this way the exhaustive network of reactions that includes all reaction intermediates. The full description of the molecule and reaction coding is described in Surla et al. (2011). Figure 12 shows the reaction network for the catalytic reforming of n-hexane to illustrate the level of detail included in this network generation algorithm.
Once the network is generated, a reaction rate is attributed to each reaction. Stability considerations concerning the carbenium ions involved in the reactions, and the further decomposition of the elementary steps involving similar activated complexes allow the writing of reaction rates that are constrained by the concept of 'single events'. For an elementary step, the activation entropy term in the Eyring Results of stochastic and deterministic simulations for the reaction A $ c1 c2 B with c 1 = 0.5 s À1 , c 2 = 0.1 s À1 , starting from X A (0) = 100, X B (0) = 0. equation is separated into its various contributions (Benson, 1976). For a homologous series of reactions in the network, the translational, vibrational and electronic contributions are considered to be the same, but the rotational contributions strongly depend on the symmetry of the molecule (Willems and Froment, 1988a,b). Hence, the entropy contributions due to the symmetry of the molecule and to its chirality are grouped into a factor and set apart from the other entropy contributions, which are grouped into what is termed the intrinsic activation entropy (Eq. 26).
The first factor is a ratio of global symmetry numbers and is termed the number of single events, while the remaining part is termed the intrinsic rate coefficient: where k is the rate coefficient of the elementary step,k is the intrinsic rate coefficient of the elementary step, also termed the single-event rate coefficient, n e is the number of single events.
The intrinsic rate coefficient of the elementary stepk only depends on the nature of the involved ions, and of the type of reaction. Thus, the reaction rates are parameterized by a limited number of parameters . For example, the PCP isomerizations are fully described by four kinetic parameters although hundreds of thousands of PCP isomerizations can occur in complex mixtures. Additional thermodynamics constraints and assumptions allow the further reduction of the number of kinetic parameters Cochegrue et al., 2011).
The generated reaction network can be simulated without any further assumptions, as shown by Verstraete (1997). Since the single-event approach generates all possible reaction surface intermediates, the resulting reaction networks are huge. Network reduction techniques can be advantageously applied on single-event reaction networks in order to reduce the number of kinetic equations of the network and to solve the continuity equations. For some of the processes (catalytic reforming, hydrocracking), the hydrogenation/dehydrogenation equilibrium is easily established on the metal phase. At the same time, experimental data obtained on various feedstocks and at various contact times demonstrated that the paraffins with the same number of carbon atoms and the same number of branches are in thermodynamic equilibrium with each other (Verstraete, 1997;Schweitzer et al., 1999). This leads to the conclusion that isomerization reactions without change in the branching degree (HS, MS, ethyl shift) are very fast compared to isomerization reactions with a change in the number of branches (PCP), cracking reactions and cyclization/ring opening reactions.

Figure 12
Reaction network for the catalytic reforming of n-hexane (without accounting for hydrogenolysis reactions).
Cyclic compounds with the same number of carbon atoms and the same number of branches are also in equilibrium with each other (Verstraete, 1997), indicating that ring contraction or expansion reactions are very fast compared to branching isomerization reactions. With this experimental information, it is possible to reduce the reaction network by means of an a posteriori relumping technique without losing any information , as described by Vynckier and Froment (1991). With this technique, molecules are collected into 'groups of isomers' according to their carbon number and their degree of branching. Inside a group of isomers, the molecules are considered to be in equilibrium because of the very fast interconversion reactions, which allows the determination of the composition within each group. The size of the reduced reaction network is much smaller, since it contains between 10 and 1000 times fewer species and fewer reactions, while the introduction of a set of multiplicative 'lumping coefficients' allows maintaining the same set of fundamental rate parameters for the relumped reactions in the reduced network.
Using the properties of these lumping coefficients and the recurrence relations between the lumping coefficients even allowed writing these a posteriori relumped kinetic models without the full generation of the network (Valéry, 2002;Valéry et al., 2007;Mitsios et al., 2009;Guillaume et al., 2011). This so-called lateral chain decomposition method divides the various components (molecules, intermediates and activated complexes) into lateral chains and activated zones (Fig. 13). Detailed inspection of these various parts allowed the derivation of a set of recursive formulae that calculate the various structural and thermodynamic properties that are needed to determine the lumping coefficients. It should be stressed that this lateral chain decomposition technique is exactly equivalent to the original calculation method based on the explicitly generated reaction network, as long as Benson's group contribution method is used to obtain the necessary thermodynamic data in both approaches (Benson and Cohen, 1993). With this lateral chain decomposition method, much larger reaction networks with higher carbon numbers can be generated (up to approximately 50 to 60 carbon atoms with the current version of the algorithm) very easily and almost instantly. The main advantage of these recursive formulae resides in the fact that application of the single-event modeling methodology to very large reaction networks and very high degrees of branching is highly simplified, both in comparison to the explicit generation of the entire reaction network and to other approaches in the literature . This lateral chain decomposition method is quite flexible and can easily be extended to more general lateral chains, by introducing new insertion patterns and adapting the recursion formulae accordingly.
After relumping, the reduced reaction network for catalytic reforming contains almost 100 lumps and about 500 reactions, while the hydrocracking network contains a few thousand reactions between a few thousand lumps, whose evolution needs to be simulated. In both cases, the reactor simulations are performed by numerically integrating the continuity equations, which are ODE, by means of the LSODE package (Hindmarsch, 1980(Hindmarsch, , 1983Petzold, 1983).
In conclusion, refining processes based on acid or bifunctional catalysts, such as isomerization, catalytic reforming, catalytic cracking, and hydrocracking, involve a tremendous number of species (reactants, intermediates and products) and reactions, both of which may rapidly exceed several thousand for industrial feedstocks. The single-event method represents an extremely elegant approach for the modeling of such processes, and also offers several advantages. First of all, it is based on a fundamental understanding of the chemistry and uses an exhaustive computer-generated reaction network that lists all reactions and reaction intermediates by means of a set of simple rules. Secondly, by applying the single-event concept and by introducing detailed thermodynamic constraints between the rate parameters, the huge reaction networks can be simulated using only a limited number of independent fundamental kinetic parameters that are feed-independent. Since these rate coefficients are independent of the number of carbon atoms in the molecules, it is possible to identify them through studies on model molecules and to predict the behavior of complex feedstocks from experimental data that can be acquired more easily. A third characteristic is that the single-event methodology also comes with a rigorous late-lumping network reduction method that uses experimentally-verified and reasonable assumptions on the chemical equilibria to regroup the species at equilibrium into 'groups of isomers' or lumps. Under these assumptions, the substantially reduced relumped reaction network is strictly equivalent to the detailed network, while the use of lumping coefficients allows maintaining the same fundamental kinetic parameters to simulate the complete system without any loss of information. Finally, a lateral chain decomposition method has been developed for the single-event methodology that allows the direct calculation of the lumping coefficients for the relumped reaction network by means of recursive formulae. This allows the application of the single-event methodology to very large reaction networks and very high degrees of branching without explicitly generating the entire reaction network.

Network Reduction Techniques for Single-Event Kinetic Modeling of Olefin Oligomerization
At IFPEN, the single-event methodology was also applied to the oligomerization of light olefins. Two additional problems arose, however. The first one concerns the generation of the reaction network. Unlike the reaction networks for catalytic reforming, hydrocracking, or catalytic cracking, the reaction network for olefin oligomerization will theoretically never end, since light olefins can be added onto any olefin of any size. Hence, no exhaustive reaction network can be generated and, in practice, the reaction network is therefore limited by specifying a maximum number of carbon atoms. The second problem concerns the a posteriori relumping that was used previously. The experimental data for olefin oligomerization shows that thermodynamic equilibrium is not reached within the molecules with the same carbon number and the same degree of branching. It is therefore not possible to reduce the detailed reaction network to a manageable number of ODE to allow a reactor simulation to occur within a reasonable time. However, the concept of 'single events' still allows the description of the elementary steps with a small number of intrinsic parameters .
To resolve these two problems (no exhaustive reaction network and no tractable set of differential equations), stochastic simulation methods were applied. Several approaches were compared to investigate various network limitation methods. First, the full reference reaction network was generated by allowing species with up to 16 carbon atoms using a deterministic network generation program, which exhaustively enumerates all species and elementary steps. The resulting reaction network contained 279,045 species and 1,729,586 elementary steps. Every additional carbon number approximately leads to a three-fold increase in the number of species and in the number of reactions. This C 16 reaction network was simulated twice, once using a deterministic numerical integration using LSODE and once by applying Gillespie's SSA. As expected, both approaches gave the same results. Reaction pathway and activated complex (a) for a b scission step and (b) for a PCP isomerization step. L 1 to L 5 are substituents which are part of the activated zone, while A and B are lateral chains, which do not participate in the elementary step .
A first network reduction approach consisted of limiting the deterministic network generation algorithm by a rulebased approach, such as putting an upper limit on the number of species and/or reactions. In this example, the network generation algorithm was stopped once the first 100,000 species (about 1/3 of the full reaction network) were generated. The resulting network is the Deterministic Limited Reaction Network (DLRN).
A second network reduction approach consisted of constructing a stochastically-generated, but limited reaction network. As in the previous methods, the reactor simulation is performed by the SSA, but starting from an 'empty' reaction network. The algorithm therefore has to generate all the possible reactions for the components present in the feed. Each of these reactions is now stored into the 'initial' reaction network. The SSA will then select one of these reactions and create the product species. If the product species are new to the current simulation, their possible reactions are generated and stored in the reaction network, while the SSA selects the next reaction. This procedure is repeated until the maximum storage limit of the reaction network is attained, in our case the first 100,000 species. From this point onwards, the simulation is continued on the stored network. For the next simulation, a new network is generated and the new simulation is based on this new network. In conclusion, this approach integrates the network generation into the SSA. Hence, the network is explored along the most probable reaction routes. Moreover, when several simulations are performed on the same feed, the average network combines all random networks and is therefore larger than each individual network. The strength of this network reduction technique resides in the fact that the exploration of the full reaction network is driven by probability considerations (exploring the more probable routes) and by averaging (combining several limited reaction networks). Figure 14 schematically depicts the two network reduction techniques in comparison to the reference reaction network. Shahrouzi (2010) investigated the impact of these two network reduction methods on the simulation results. All simulations were performed using the SSA, and the Deterministic Full Reaction Network (DFRN) was used as a reference. The stochastic simulation of the DFRN represents the reference simulation, and a single simulation starting from 100,000 feed molecules took approximately 56.6 hours of CPU time on a dual-core CPU running at 2.33 GHz (Shahrouzi, 2010). The first rule-based network reduction technique provided a DLRN that contained the first 100,000 species and their reactions. During its stochastic simulation, the DLRN was loaded into memory, leading to a memory requirement that was approximately 10 times lower. A single stochastic simulation took approximately 4.5 hours of CPU time, which is 12.5 times faster than the stochastic simulation of the DFRN reference network. The second network reduction technique created a Stochastically Limited Reaction Network (SLRN). During the stochastic simulation, the SLRN grows into memory until 100,000 species and their reactions are stored. Again, the memory requirements are approximately 10 times lower than for the DFRN. The stochastic simulation itself also took approximately 4.5 hours of CPU time. The results of the three simulations are compared in Figure 15. For the first reduction technique, the results of the DLRN are quite different compared to those for the DFRN, not only for the reaction intermediates (the carbenium ions), but also for the final products (the olefins). The quality of the simulation with the deterministic reduced reaction networks was shown to highly depend on the reduction rules that are used during the network generation (total number of species, total number of reactions, rank of the products, reaction (free) enthalpies, presumed reaction rate, etc.). Moreover, by selecting a maximum number of species, the resulting network also depends on the processing order of the network generation algorithm. For the second reduction technique, the results for the SLRN agree well with the simulations of the DFRN. A very good agreement is observed for the final products, even though some discrepancies arise for the high carbonnumber carbenium ion intermediates (Shahrouzi, 2010). This illustrates that the SLRN allows the capturing of all the features of the full deterministic reaction network with a ten-fold reduction in computer memory and in CPU time. This is mainly due to the fact that the full reaction network is explored based on reaction probability considerations along the more probable reaction pathways. The SLRN network reduction technique also has an additional advantage: when several stochastic simulations are performed, the use of a new SLRN during each simulation also allows the coverage of a bigger network than if the first SLRN was re-used in all simulations. Hence, an alternative network reduction technique can also be suggested here: if the size of the cumulative reaction network remains acceptable, it can be exported and re-used with deterministic simulation tools. This approach would allow to stochastically generate reaction networks based on the reaction probabilities.
In the examples above, the chosen implementation still requires the provision of memory space to store the reaction network. An alternative technique consists of performing stochastic simulations without a pre-generated network by using a Markovian approach. Consider a population of molecules without a predefined reaction network. If one can reconstruct for each molecule the different reactions that can occur and their propensity function, the reactivity of the ensemble of molecules can be stored via a global propensity function. If this information can be updated after every reaction event, the evolution of the population of molecules no longer requires the storage of a huge pre-defined reaction network. This approach is particularly interesting for processes where the initial mixture is limited to a few different molecules because the initial propensity functions are easily computed. Shahrouzi (2010) applied this approach to olefin oligomerization reactions. The main benefit of this Markovian simulation technique resides in the exploration of different parts of a theoretically quasi-infinite reaction network without having to store it.

Full Stochastic Kinetic Modeling of Hydrotreating and Hydroconversion Processes
For refining processes that treat complex mixtures, such as heavy oil fractions, a full stochastic kinetic modeling approach was developed. In the proposed two-step kinetic modeling approach (Fig. 16), the first step, the 'composition modeling step', overcomes the lack of molecular detail of complex petroleum fractions by using a set of molecules that are carefully selected based on available analyses. The available analytical information, which is generally relatively limited, is transformed into a set of molecules that has the same properties as the petroleum fraction. This transformation is carried out through the above-described SR -Entropy Maximization (SR-REM) algorithm, which stochastically assembles various structural blocks before adjusting the molar fractions of the resulting molecules. Once the composition Mass percent product distributions of (a) olefins and (b) carbenium ions for the oligomerization of C4 olefins up to C16 at 67% of conversion. Feed composition is 50% iso-butene, 25% 1-butene and 25% 2-butene. The initial molecular population is 100,000. In the two limited reaction networks, the PCP steps are not considered and the protonation/deprotonation equilibrium is also considered in the DLRN.
of the feedstock is adequately represented, the second step, called the 'reaction modeling step', simulates the effect of the reactions on this set of molecules by means of a kinetic MC method. For this simulation, the above-described SSA was selected, thereby retaining the molecular detail of the process throughout the entire reaction simulation. This approach is based on a Markovian process, which tracks the transformation of a discrete population of molecules, event by event.
Each event describes the transition from one state to another as the result of a chemical reaction. To perform the reactor simulations, the Markovian technique does not need the full reaction network, but only a list of reaction rules that define the possible reactions for each molecule. The probability of each reaction is determined depending on the reactant molecules and the reaction rate constant. By tracking the evolution of the population of molecules, the reaction network is generated 'on-the-fly'. Overall, this two-step methodology therefore models both the feedstock composition and the process reactions at a molecular level. This modeling methodology was first developed and validated on somewhat simpler gas oil feeds (de Oliveira et al., 2012;López Abelairas et al., 2016). The hydroprocessing of industrial LCO gas oils was simulated and compared to experiments carried out in an isothermal fixed-bed up-flow reactor containing 200 mL of a sulfided commercial NiMo/Al 2 O 3 catalyst (López Garcia, 2000;López García et al., 2003) and operating at 320°C and 70 bar. The LCO gas oils were characterized through elemental analysis (carbon, hydrogen, sulfur), MS, 13 C RMN and simulated distillation. The LCO gas oils were represented by 5000 molecules which were reconstructed from these analyses by using the probability distribution functions of the structural attributes. To simulate the hydrogenation of the LCO gas oils, a limited set of reaction types needs to be considered: hydrogenation of aromatic rings, dehydrogenation of cyclohexane rings, and hydrodesulfurization of sulfur species. The evolution of the set of 5000 molecules was assessed by performing 50 stochastic simulations and averaging the properties of the mixture.
For the LCO gas oils, the content of the aromatic families of the reconstructed feed mixture shows a good agreement with the analytical values. The same is true for the other properties and illustrates the representativeness of the generated set of 5000 molecules, thus proving that the LCO gas oil is well reconstructed. The simulations of the LCO hydrotreating process showed that the overall properties of the effluent were well predicted for various feedstocks and operating conditions (Fig. 17). The prediction of the hydrogenation of aromatic compounds is quite accurate. The results illustrate that the methodology is able to provide a good prediction of the behavior of hydrotreating process behavior on complex feeds with different molecular compositions. The methodology is also able to predict molecular properties of the effluent that are not accessible in traditional kinetic models.
To illustrate the full potential of the method, the abovedescribed methodology was also applied to the hydroconversion of an Athabasca VR. Elemental analysis (C, H, S, N, O), average molecular weight, specific gravity, simulated distillation, SARA separation and 13 C NMR were used to characterize the VR feed analytically (de Oliveira et al., 2013a). The conversion of the VR was again simulated using the SSA. For each molecule, an algorithm makes an inventory of the possible reactions of the molecule, and calculates the corresponding rate coefficients by means of appropriate QSRC for the hydrogenation-dehydrogenation equilibrium constants, the hydrogenation and hydrodesulfurization rate constants, the ring dealkylation, and the ring opening rate constants: ln K eq À Á In the above equations, the reaction enthalpies were calculated by means of Benson's group contribution method; while all activation energies were directly taken from literature: the hydrogenation and desulfurization activation energies were taken from López García (2000), and the dealkylation activation energies from taken from Gauthier et al. (2007) and Danial-Fortain (2010 The VR was represented by a set of 5000 molecules in order to ensure an optimal balance between the required CPU time and the accuracy of the feedstock representation. The mixture properties obtained after the SR step show a good agreement with most of the analytical data. These results illustrate that the reconstructed feedstock composed of 5000 molecules represents quite accurately the Athabasca VR, although VR fractions contain more complex molecules than LCO gas oils. For the hydroconversion of VR fractions, the evolution of the synthetic mixture of 5000 molecules was Comparison between experimental and simulated effluent weight fractions for the hydrocarbon families (SAT -Saturates, MONO -Monoaromatics, DI -Diaromatics, TRI+ -Polyaromatics with at least three aromatic cycles) and sulfur families (BT -Benzothiophenes, DBT -Dibenzothiophenes) during hydrotreating of two different LCO gas oils. obtained by performing 100 stochastic simulations and averaging the properties of the populations of molecules. The overall conversion is well predicted and the overall product yields and product property predictions follow the correct trends with the operating conditions (de Oliveira et al., 2014). The reactor simulations compared favorably (Fig. 18) to experiments carried out in a batch reactor at 410°C and 150 bar . The effect of temperature was validated by comparing the simulation results to experimental data on a Ural VR at 395°C and 410°C.
In conclusion, this two-step kinetic modeling strategy has been applied to the hydroprocessing of oil fractions: hydrotreating of LCO and hydroconversion of VR. The simulation results were compared to the experimental data, illustrating that the combined 'molecular reconstruction stochastic simulation' approach nicely simulates the process performances. The proposed approach has several advantages. First of all, the complex feedstock is represented by means of a synthetic mixture of molecules. This representation allows the retention of a molecular description of the reaction system throughout the entire reactor simulation. Secondly, the Markovian approach of the SSA allows the tracking of the evolution of the population event by event, without a pre-defined reaction network. To this end, only a limited number of reaction types need to be defined, and the reaction network is generated 'on-the-fly' as the reactions proceed. Finally, by following the modifications to the synthetic mixture throughout the reactor simulation, the evolution of all product yields and product properties can be assessed by evaluating the yields, composition and characteristics of sub-groups of molecules. This means that this methodology is also able to predict various molecular properties of the effluent that are not accessible in traditional lumped kinetic models. Comparison between experimental data and simulation results for (a) residue conversion, (b) sulfur removal, and (c) product yields during hydroconversion of Ural VR at 395°C and 410°C.

CONCLUSIONS
Chemical kinetics studies the rates of chemical reactions and addresses how the reaction rates depend on concentrations, temperature, pH, solvents, nature of the catalysts, to mention a few reaction conditions. In this review paper, an overview was given of kinetic modeling techniques for complex processes. When systems with several hundred or several thousand components, reaction intermediates, reactions, and elementary steps are being considered, two mean modeling strategies can applied: a lumping strategy or a detailed modeling strategy. In the first strategy, the chemical complexity is reduced by grouping chemical compounds into families or 'lumps' and grouping individual reactions into lumped reactions between these families. In this case, a mean-field approach together with a RDS approximation is generally used. Two examples of lumped kinetic models (atmospheric gasoil hydrotreating and residue hydroprocessing) developed at IFPEN have been presented.
The largest part of this review described detailed kinetic modeling approaches. In these detailed modeling strategies, the chemical detail is generally retained at the molecular level, or even at the level of the reaction intermediates throughout the entire reactor simulation, which allows the creation of the feedstock-independent models. This implies, however, that a molecular description of the feedstock can be provided to the model. When advanced analytical techniques are not able to provide a detailed characterization of the feedstock, a molecular representation of the feedstock can be created through molecular reconstruction algorithms, also called composition modeling techniques. Another hurdle to be taken concerns the generation of large size reaction networks. Indeed, as the number of species and the number of reactions increases almost exponentially with increasing carbon number, the reaction network becomes so large that it needs to be computer generated and managed along the simulation. In this review, deterministic and stochastic network generation and simulation approaches have been discussed, together with examples of kinetic models developed at IFPEN for several refining process applications.
All these examples given illustrate that one can acquire a more fundamental understanding of very complex processes by applying advanced techniques for composition, kinetic and reactor modeling.