Morphological Component Analysis for the Inpainting of Grazing Incidence X-Ray Diffraction Images Used for the Structural Characterization of Thin Films

Dossier: Advances in Signal Processing and Image Analysis for Physico-Chemical, Analytical Chemistry and Chemical Sensing

Grazing Incidence X-ray Diffraction (GIXD) is a widely used characterization technique, applied for the investigation of the structure of thin films. As far as organic films are concerned, the confinement of the film to the substrate results in anisotropic 2-dimensional GIXD patterns, such as those observed for polythiophene-based films, which are used as active layers in photovoltaic applications. Potential malfunctions of the detectors may degrade the quality of the acquired images, thus affecting the analysis process and the structural information derived. Motivated by the success of Morphological Component Analysis (MCA) in image processing, we tackle in this study the problem of recovering information that is missing from GIXD images because of detector malfunction. First, we show that the geometrical structures which are present in the GIXD images can be represented sparsely by means of a combination of over-complete transforms, namely, the curvelet and the undecimated wavelet transform, resulting in a simple and compact description of their inherent information content. Then, the missing information is recovered by applying MCA in an inpainting framework, by exploiting the sparse representation of GIXD data in these two over-complete transform domains. The experimental evaluation shows that the proposed approach is highly efficient in recovering the missing information in the form of either randomly burned pixels or whole burned rows, even when on the order of 50% of the total number of pixels are missing. Thus, our approach can be applied to remedy problems related to detector performance during acquisition, which is of high importance in synchrotron-based experiments, since the beamtime allocated to users is extremely limited and any technical malfunction could be detrimental to the course of the experimental project. Moreover, our results suggest that long acquisition times or repeated measurements become less necessary, which adds extra value to the proposed approach.

INTRODUCTION
Since the introduction of X-ray diffraction for the investigation of the unit cell of crystals at the beginning of the 20th century, X-rays have become an indispensable characterization tool that probes the structure of a wealth of materials, ranging from inorganic crystals and powders to organic small molecules and polymers, up to proteins and other biological samples [1,2]. During the last 20 years, the need for characterizing the structure of 2-dimensional (2-D) systems, such as thin films that are of particular interest in microelectronics and other nanosciences, led to the emergence of Grazing Incidence X-ray Diffraction (GIXD) [3]. GIXD exploits the principles of X-ray diffraction; however, the X-ray beam probes the sample at a very small incident angle, typically below 1°, thus allowing an effective increase of the penetration depth and, consequently, the interaction of X-rays with a bigger part of the sample. On top of that, the physical confinement of one side of the film to the substrate may result in preferentially oriented scatterers that give rise to non-isotropic diffraction patterns. Such patterns are recorded when studying organic thin films, like polythiophene-based films, which are extensively used in organic photovoltaics [4].
In order to acquire and exploit all information contained in the anisotropic diffraction patterns, 2-D detectors are utilized. Data acquisition is followed by sophisticated data analysis, which presupposes high-quality recorded images. For this reason, synchrotron-based experiments are preferred, since the high X-ray flux provided in a synchrotron facility allows for an increased signal-to-noise ratio. However, due to the large demand for synchrotron beamtime, the time allocated to users is extremely limited and the success of the experiments is imperative. In the quest for an alternative way of recovering information that is only partially recorded in the GIXD images, due to technical problems during the experiment or to acquisition times shorter than optimal, we turn to image processing theories.
In image processing, finding an efficient and compact representation of the data under consideration is of major importance in several distinct tasks, such as compression, denoising and restoration, to name a few. In the quest for a suitable transform, sparsity of the representation was recognized as a key requirement in seeking simplifying operations [5,6]. Specifically, the design of over-complete redundant representations is now at the core of many state-of-the-art algorithms used in image compression [7], denoising [8], deconvolution [9] and restoration [10]. In each case, an image is represented as a linear combination of atoms from a dictionary, where the number of atoms is much larger than the original image dimension. Due to the redundancy, there exist numerous ways to represent the image; our preference is for the sparsest one, that is, the one with the fewest non-zero components, as being the simplest.
Focusing on images, another important task is to decompose the data into elementary building blocks. The successful separation of the image content is crucial for its effective analysis, as well as for tasks, such as, image enhancement, compression and synthesis. Numerous methods have been proposed for the solution of the image separation problem, especially in the frameworks of Blind Source Separation (BSS) [11] and Independent Component Analysis (ICA) [12]. In addition, the need to recognize structures of different sizes in a given image, makes it impossible to define a priori an optimal resolution for analyzing it. Multiresolution decomposition was introduced as a simple hierarchical framework for interpreting the image information, where at different resolutions the details of an image generally characterize different physical structures.
In this direction, multiscale methods have become popular image processing tools in the last couple of decades, especially with the development of wavelets [13]. While the Discrete Wavelet Transform (DWT) was implemented successfully in image compression, the results were far from optimal for other image processing tasks, such as, deconvolution, detection and filtering. The dual-tree complex wavelet transform [14] was introduced as a valuable enhancement of the traditional DWT, which is nearly shift invariant and, in higher dimensions, it is characterized by directional selectivity. One of the reasons to focus on the design of new redundant representations was the need to preserve the shift-invariance property, while also approximating more closely the continuous analogue.
For this purpose, several novel tailored multiscale and multidirectional redundant transforms have been introduced in the literature, including, among others, the Undecimated Discrete Wavelet Transform (UDWT) [15], the curvelet transform [16], the contourlet transform [17] and the bandlet transform [18]. Most importantly, each of these transforms adapts to specific characteristics and structures in a given image, thus, yielding highly sparse representations in the presence of the corresponding structures, in a nonlinear approximation scheme. For instance, sparse approximations of piecewise smooth images with point singularities are obtained using the UDWT, which is efficient in capturing roughly isotropic features. However, this is no longer optimal in case of piecewise smooth images with singularities along smooth curves or edges. Such images are approximated more efficiently using the curvelet transform, which is highly anisotropic and thus exhibits high directional selectivity by defining an adapted multiscale geometry.
Apart from decomposing an image in terms of the physical size and orientation of its structural content via a multiscale transform, natural images are often considered to consist of homogeneous regions and oscillating patterns (e.g., texture and noise). In this latter case, it is also of high importance to be able to decompose an image into its constituent components. Several methods [19][20][21][22] have been introduced for decomposing a given image into a component with bounded variation, which holds the geometrical information, and an oscillating component, which corresponds to the textural information. A recent work [23] generalized the previous approaches by providing a method for decomposing an image into more than two components, while also being able to handle data corrupted by a linear operator and by noise that is not necessarily Gaussian.
Tackling the decomposition problem from a different perspective, the use of sparsity as a desired property to rely on was recognized earlier [24,25]. In this framework, Morphological Component Analysis (MCA) [26] is a recent novel technique, which exploits the sparse representation of structured data in large, generally over-complete, transform domains (or dictionaries) to separate them in a set of distinct components based on their difference in morphology. The method is based on the assumption that for each morphological feature to be separated, there exists a suitable transform that enables its reconstruction via a sparse representation, while this transform is highly inefficient in representing the other morphological features. For instance, it has been shown that MCA can be used to separate the texture from the piecewise smooth component of a given image, by noticing that the former is characterized well using local cosine functions, while the latter may be well represented using curvelets [27]. The MCA algorithm has been employed successfully for the analysis of data in several areas, such as, video analysis [28], astrophysics [29] and medical imaging [30,31], while it was extended recently in the framework of "image inpainting" [32], enabling the treatment of problems where parts of the image are missing or corrupted.
Motivated by the success of MCA in signal and image processing, the purpose of this study is to exploit the effectiveness of modern over-complete transforms and extend the applicability of MCA to the analysis of 2-D GIXD data. To the best of our knowledge, this is the first study that bridges the gap between the analysis of diffraction data and state-of-the-art image processing techniques. More specifically, first we show that the geometrical structures, which are present in GIXD images for the specific type of thin films considered in this study, can be represented efficiently by means of a combination of over-complete transforms, namely, the UDWT and the curvelet transform, resulting in a simple and compact description of their inherent structures. Then, we tackle the important problem of recovering the missing information in GIXD images, e.g., due to potential malfunction of the detectors, by applying MCA in an inpainting framework.
The experimental evaluation using high-resolution GIXD images shows that the proposed approach is highly efficient in recovering missing information in the form of either randomly burned pixels or whole burned rows, even when on the order of 50% of the total number of pixels are missing (which is anyway an extreme practical scenario). This missing information may inhibit the post-processing of GIXD images and data evaluation. Data analysis is mainly based on the reduction of the 2-D images into 1-D intensity patterns. This is done either by integrating the diffracted intensity (throughout the whole image or on specific sectors) or by plotting the intensity across the meridian and/or the horizon. In case of missing information due to burned pixels, or inaccurate statistics related to a small number of measurements due to reduced acquisition times, incomplete 1-D intensity plots will be derived. This entails the danger of data misinterpretation and of the derivation of false scientific outcomes, which are highly sensitive to the form of the intensity patterns. The added value of our approach thus stems from its application in remedying potential problems related to detector performance during image recording. Additionally, MCA weakens the necessity for long acquisition times or repeated experiments. All these merits are of high importance in synchrotron-based experiments, since the beamtime allocated to users is extremely limited and any technical malfunction could be detrimental to the course of the experimental project.

Figure 1. Schematic of the GIXD setup. a) GIXD setup: α is the incidence angle, 2θ is the scattering angle. b) Directions of the various linecuts that can be performed for the evaluation of GIXD data.

The rest of the paper is organized as follows: Section 1 provides the principles of the GIXD technique, experimental details on image acquisition and a brief description of the thin films under study. Section 2 overviews the basic concepts of MCA applied in the framework of inpainting. Section 3 studies the efficiency of a combination of over-complete redundant transforms, namely, the UDWT and the curvelet transform, in decomposing the given GIXD images into their constituent morphological components. Then, the performance of MCA in recovering the missing information, in the form of individually burned pixels or whole burned lines distributed uniformly at random over the whole image area, is evaluated and compared with the performance of classical inpainting methods. Finally, we conclude and give directions for future work.

GIXD MEASUREMENTS AND MATERIALS UNDER STUDY
GIXD experiments were performed on the Dutch-Belgian Beamline (DUBBLE CIG), station BM26B, at the European Synchrotron Radiation Facility (ESRF), Grenoble, France.
In GIXD, the X-ray beam probes the sample at the grazing geometry and the diffracted intensity is recorded by a 2-D position sensitive detector, which is placed after the sample typically at a distance of around 20-40 cm (Fig. 1a). The energy of the X-rays was 12 keV and the angle of incidence was set at 0.15°. The diffracted intensity was recorded by a Frelon CCD camera and it was normalized by the incident photon flux and the acquisition time. Each pixel monitors the intensity as a function of the scattering vector q, which is defined with respect to the center of the incident beam and has a magnitude of q = (4π/λ) sin θ, where 2θ is the scattering angle and λ is the wavelength of the X-ray beam.
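As a minimal illustration of the relation q = (4π/λ) sin θ, the following Python sketch converts between the scattering angle 2θ and the magnitude of the scattering vector. The function names are hypothetical, the photon energy of 12 keV is an assumed illustrative value, and the conversion λ[Å] ≈ 12.398/E[keV] is the standard photon energy-wavelength relation rather than something taken from the experimental description.

```python
import math

# h*c expressed in keV * Angstrom (standard physical constant, assumption of this sketch)
HC_KEV_ANGSTROM = 12.398419843


def wavelength_angstrom(energy_kev):
    """Photon wavelength in Angstrom for a given photon energy in keV."""
    return HC_KEV_ANGSTROM / energy_kev


def q_magnitude(two_theta_deg, wavelength):
    """Magnitude of the scattering vector, q = (4*pi/lambda) * sin(theta)."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi / wavelength * math.sin(theta)


def two_theta_for_q(q, wavelength):
    """Scattering angle 2*theta (degrees) at which a given q is observed."""
    return 2.0 * math.degrees(math.asin(q * wavelength / (4.0 * math.pi)))


# Illustrative use: where would a ring at q = 0.37 1/Angstrom appear at 12 keV?
lam = wavelength_angstrom(12.0)          # assumed photon energy
ring_angle = two_theta_for_q(0.37, lam)  # scattering angle in degrees
```

Since sin θ ≤ 1, the largest accessible value is q = 4π/λ, which gives a quick sanity check on whether a quoted q value is compatible with a quoted photon energy.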
In order to analyze the GIXD data several "linecuts" are performed. Usually, the 2-D images are radially averaged around the centre of the primary beam, which gives the 1-D plot of the intensity as a function of q. In a second approach, linecuts across the meridian and/or the horizon are extracted, which provide the intensity plots across the out-of-plane ($q_z$) and in-plane ($q_{xy}$) directions, respectively (Fig. 1b). Finally, azimuthal scans can be performed around a q-range of special interest, herein at around q = 0.37 Å$^{-1}$ (the highest intensity ring apparent in the test images shown in Fig. 2). In this case, the diffracted intensity is plotted as a function of the polar angle χ, which is defined with respect to the normal to the substrate (Fig. 1b).
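The radial averaging step can be sketched as follows. This is only an illustration under simplifying assumptions: the pixel radius is used as a proxy for q (converting it to q would require the sample-detector distance and the pixel size, which are not assumed here), the function name and its arguments are hypothetical, and an optional binary mask excludes burned pixels from the average.

```python
import numpy as np


def radial_profile(image, center, mask=None):
    """Average intensity in rings of one-pixel width around `center`.

    image  : 2-D array of intensities
    center : (row, col) position of the primary beam
    mask   : optional binary array, 1 = valid pixel, 0 = burned pixel
    """
    rows, cols = np.indices(image.shape)
    radius = np.hypot(rows - center[0], cols - center[1])
    bins = radius.astype(int).ravel()          # ring index of every pixel
    weights = np.ones(image.shape) if mask is None else mask.astype(float)
    total = np.bincount(bins, weights=(image * weights).ravel())
    counts = np.bincount(bins, weights=weights.ravel())
    # Rings without any valid pixel are reported as NaN instead of zero.
    return np.where(counts > 0, total / np.maximum(counts, 1.0), np.nan)


# Illustrative use on a synthetic constant image with one burned row.
demo_image = np.full((9, 9), 2.0)
demo_mask = np.ones((9, 9))
demo_mask[0, :] = 0.0
profile = radial_profile(demo_image, (4, 4))
profile_masked = radial_profile(demo_image, (4, 4), demo_mask)
```

Rings that contain only burned pixels come out as NaN, which makes the "incomplete 1-D plot" effect of missing data explicit in the profile.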
The GIXD images presented herein were collected from polythiophene-based films. Poly(3-hexylthiophene), P3HT, is a semi-conducting polymer that is widely used as a donor material for the fabrication of organic photovoltaic devices. For our study, P3HT was blended with [6,6]-phenyl-C$_{61}$-butyric acid methyl ester, PCBM, a fullerene-based organic small molecule, in equal masses in chlorobenzene and 100 nm thick films were spin-coated on indium tin oxide, ITO, substrates. Two films were prepared and annealed at 160 °C for 10 min (image X1) and 20 min (image X2), respectively, in order to induce small morphological changes that will result in small changes in the two GIXD images. We opt to use polythiophene-based films as a case study due to the non-isotropic features that are apparent in their GIXD patterns. We note that no discussion on the structural characteristics of the two films will be presented in this work, since our objective is to stress the potential use of image processing techniques in physicochemical applications.

IMAGE INPAINTING USING MCA
In this section, the main principles of MCA are introduced in the framework of image inpainting. Since the performance of the MCA algorithm depends strongly on the degree of sparsity achieved for the analyzed data, we start by briefly introducing the concept of sparsity in transform domains. Then, the process of decomposing an image into a set of distinct morphological components by exploiting sparse representations is described, along with the extension of MCA as a solution to the image inpainting problem.

Sparse Recovery in a Transform Domain
In the following, we consider for convenience the case of square N × N images, although the proposed approach extends straightforwardly to the general non-square case. Let $X \in \mathbb{R}^{N \times N}$ be the given image, $T\{\cdot\}$ denote a sparsifying transformation and $c \in \mathbb{R}^{L}$ be the vector of transform coefficients, that is, $c = T\{X\}$. Although $T\{\cdot\}$ can be linear or non-linear in the general case, linear transformations are strongly preferred, for computational and implementation purposes, in several signal and image processing tasks (e.g., restoration, denoising, deconvolution). Thus, in the subsequent analysis the linear case is considered. We note the following two remarks concerning the sparsifying transform:
- in case of linearity, both the forward, $T\{\cdot\}$, and the inverse transform, $T^{-1}\{\cdot\}$, can be expressed in matrix form, while the transform coefficients are obtained by means of simple matrix-vector multiplications;
- in case of an over-complete redundant representation the number of transform coefficients is larger than the original image dimension, that is, $L > N^2$.
An image X is said to be "K-sparse" in the transform domain T if the coefficient vector c has exactly $K \ll L$ non-zero components. However, in practice a natural image X is not strictly sparse but "compressible", which means that the magnitude of the re-ordered transform coefficients decays according to a power law:
$$ |c^r_i| \le C\, i^{-\delta}, \quad i = 1, \ldots, L, \tag{1} $$
where $c^r_i$ denotes the i-th sorted coefficient, C is a real positive constant and δ > 0 controls the rate of decay. Thanks to the rapid decay of their coefficients, compressible images are well-approximated by K-sparse images, that is, by keeping the K most significant (largest magnitude) coefficients $c^r_i$, i = 1, . . . , K. Following a typical transform coding-based approach, one way to choose the value of K is to keep the K transform coefficients that contain a predetermined percentage of the total energy. In practice, the value of this percentage is set in a heuristic way. In contrast to a complete transform basis, where X has a unique exact representation, an over-complete redundant transform yields several exact representations. In the latter case, these representations are not equally interesting in terms of modeling and feature extraction. In particular, the representation of X by means of highly sparse coefficient vectors c is promoted, since it usually leads to a more concise and possibly more interpretable representation of X.
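The energy-based choice of K described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a real-valued 1-D coefficient vector; the function name and the example values are hypothetical and are not taken from the experiments.

```python
import numpy as np


def keep_energy(c, fraction):
    """Keep the K largest-magnitude coefficients holding `fraction` of the energy.

    Returns the K-sparse approximation of c and the value of K.
    """
    c = np.asarray(c, dtype=float)
    order = np.argsort(np.abs(c))[::-1]        # indices sorted by decreasing magnitude
    cumulative = np.cumsum(c[order] ** 2)      # energy captured by the first k terms
    k = int(np.searchsorted(cumulative, fraction * cumulative[-1])) + 1
    k = min(k, c.size)
    approx = np.zeros_like(c)
    approx[order[:k]] = c[order[:k]]           # all other coefficients are set to zero
    return approx, k


# Illustrative use: keep 95% of the energy of a small coefficient vector.
coeffs = np.array([4.0, -3.0, 1.0, 0.5, 0.0])
sparse_coeffs, k_used = keep_energy(coeffs, 0.95)
```

For a compressible vector the cumulative energy saturates quickly, so K stays small even for percentages close to 100%.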
However, selecting the smallest subset of transform basis functions (also called "atoms") from a large redundant set, which will be combined linearly to reproduce the salient features of a given image, is a hard combinatorial problem. In a formal way, the requirement for obtaining the sparsest representation c for a given image X in a transform domain T is written as follows:
$$ \min_{c} \|c\|_0 \quad \text{subject to} \quad X = T^{-1}\{c\}, \tag{2} $$
where $\|c\|_0$ denotes the $\ell_0$-pseudo-norm$^{(1)}$, defined as the number of non-zero elements of the vector c. The difficulty in addressing (2) is that this optimization problem is highly non-smooth and non-convex, while it has also been proved to be NP-hard in terms of computational complexity [33].
On the other hand, the relaxation of (2) obtained by replacing the $\ell_0$-pseudo-norm with the $\ell_1$-norm reduces to a linear program, and hence it can be solved in polynomial time:
$$ \min_{c} \|c\|_1 \quad \text{subject to} \quad X = T^{-1}\{c\}, \tag{3} $$
where the $\ell_1$-norm of c is given by $\|c\|_1 = \sum_{i=1}^{L} |c_i|$. Moreover, it was shown that under certain conditions, the solutions of (2) and (3) are identical [34,35].
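To make the link with linear programming explicit, one standard reformulation is sketched below. It assumes that the linear inverse transform is written in matrix form as a synthesis matrix $\Phi$ acting on c, and that x is the vectorized image; the symbols $\Phi$, x, u and v are introduced here only for illustration. The coefficient vector is split into its positive and negative parts, c = u − v with u, v ≥ 0, so that the $\ell_1$-norm becomes a linear objective:

```latex
\min_{u,\,v \in \mathbb{R}^{L}} \; \mathbf{1}^{\top}(u + v)
\quad \text{subject to} \quad
\Phi\,(u - v) = x, \qquad u \ge 0, \qquad v \ge 0 .
```

At an optimum, u and v have disjoint supports, so that $\mathbf{1}^{\top}(u+v) = \|u - v\|_1 = \|c\|_1$, and the minimizer c = u − v solves the $\ell_1$ problem.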
Several pursuit algorithms with empirical success have been proposed for the solution of (3), among them, the greedy Matching Pursuit (MP) [36], the Basis Pursuit (BP) [37] and their variants [43]. Nevertheless, in applications involving large data sets, such as, the high-resolution GIXD images we deal with, MP or BP algorithms are computationally intense.
On the other hand, as discussed in [27], a single basis is often not well-adapted to large classes of highly structured data such as "natural images". Furthermore, over the past years, new tools have emerged from modern computational harmonic analysis, such as, wavelets [5], ridgelets [38] and curvelets [39], to name a few. It is quite tempting to combine several representations to build a larger dictionary of waveforms that will enable the sparse representation of larger classes of signals.
Morphological Component Analysis (MCA) [26] was introduced recently, aiming at decomposing signals in (generally over-complete) dictionaries made of a union of bases. MCA serves as a fast alternative to other algorithms in the sparsity literature, like the ones mentioned above, where the solution of the corresponding minimization problem requires dealing with unknowns, that is, the sparse coefficient vectors, living in a high-dimensional space (which gets larger as the dictionaries become more redundant). In contrast, MCA solves a minimization problem directly in terms of the morphologically distinct components, whose dimension is less than or equal to the dimension of the corresponding transformed version. The following sections introduce in brief the main concepts of MCA, as well as its extension in the framework of image inpainting, to be exploited for recovering the information missing from our GIXD data due to detector malfunction.

(1) Despite the fact that $\ell_0$ is a pseudo-norm, the improper term "norm" is often used, too.

Overview of MCA
In the subsequent analysis, we assume that a given image X is modeled as a linear combination of S images (components) with different morphologies, $X = \sum_{s=1}^{S} X_s$. The fundamental assumption of MCA is morphological diversity, which relies on the sparsity of those morphological components in specific bases. In other words, MCA is based on the existence of a set of transforms (or a dictionary of bases) $\{T_1, \ldots, T_S\}$ such that the s-th component $X_s$ is sparsely represented in $T_s$, while its representation in the other transform domains $T_{s'}$, $s' \neq s$, is not sparse. This is ensured with high probability by an increased incoherence between the distinct dictionaries. The problem to be solved is the separation of the linear mixture X into its constituent morphological components $X_s$, relying on the discriminative power of the distinct transforms $T_s$, s = 1, . . . , S.
By extending (3) to the multi-component case, the problem of recovering the corresponding sparse coefficient vectors $\{c_s \in \mathbb{R}^{L_s}\}_{s=1,\ldots,S}$ is expressed as follows:
$$ \min_{c_1, \ldots, c_S} \sum_{s=1}^{S} \|c_s\|_1 \quad \text{subject to} \quad X = \sum_{s=1}^{S} T_s^{-1}\{c_s\}. \tag{4} $$
Notice also that, generally, the coefficient vectors $c_s$ can be of different dimension $L_s$ depending on the corresponding transform. For this reason, we also keep denoting a linear transform as a general operator $T_s\{\cdot\}$, instead of using a matrix notation $\mathbf{T}_s$, to avoid any inconvenience due to the potentially different dimensions among the transform coefficient vectors corresponding to distinct morphological components.
The solution of (4) should be expected to give a truly sparse decomposition if the image X is indeed composed solely of the morphological components $X_s$, s = 1, . . . , S, and thus it can actually be represented sparsely in terms of the transforms $T_s$. However, in a real-world scenario the assumption of an exact decomposition does not hold in general. To account for this, we reformulate the constrained optimization problem (4) as an unconstrained regularized one. Moreover, the increased memory and computational costs incurred when we work directly with the coefficient vectors $c_s$, whose dimensions can be much higher than the dimension of the original image in case of highly redundant transforms, are avoided by solving the optimization problem using the morphological components $X_s$ as the unknowns. The above two considerations result in the following minimization problem solved by the MCA algorithm [26]:
$$ \min_{X_1, \ldots, X_S} \sum_{s=1}^{S} \|T_s\{X_s\}\|_1 + \tau \Big\| X - \sum_{s=1}^{S} X_s \Big\|_2^2, \tag{5} $$
where τ > 0 is a regularization parameter, which controls the amount of distortion in representing X in terms of the morphological components $X_s$. This problem has a quadratic programming structure, for which efficient solvers exist, with "soft-thresholding" being among them, as described in the following section.
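For reference, the soft-thresholding operator mentioned above is commonly defined element-wise as sign(c) · max(|c| − τ, 0). A minimal Python sketch, assuming real-valued coefficients stored in a NumPy array (the function name is hypothetical), is:

```python
import numpy as np


def soft_threshold(c, tau):
    """Shrink every coefficient towards zero by tau; values below tau become zero."""
    c = np.asarray(c, dtype=float)
    return np.sign(c) * np.maximum(np.abs(c) - tau, 0.0)


# Illustrative use: coefficients with magnitude below the threshold are removed.
shrunk = soft_threshold(np.array([-3.0, -0.5, 0.0, 0.5, 3.0]), 1.0)
```

Coefficients larger than τ in magnitude are kept but reduced by τ, which is what distinguishes soft from hard thresholding.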

Extending MCA for Image Inpainting
The MCA algorithm can be easily extended to the framework of image inpainting, where the main requirement is the preservation of discontinuities (e.g., edges and textures). In the following, we focus on the case where part of the original image information content is missing, in the form of occluded ("burned") pixels. Let M be a "binary mask", where zeros indicate that the corresponding pixels in the original image X have been occluded, while ones indicate valid pixels. Then, the observed image captured by the detector can be expressed as:
$$ X_d = M \circ X, \tag{6} $$
where $X_d$ is the detected, possibly corrupted, image and $\circ$ denotes element-by-element multiplication (that is, for two matrices A, B of the same dimensions we have that $[A \circ B]_{ij} = A_{ij} B_{ij}$). The standard optimization problem (5) solved by the MCA algorithm can be modified easily to account for inpainting the missing information as follows:
$$ \min_{X_1, \ldots, X_S} \sum_{s=1}^{S} \|T_s\{X_s\}\|_1 + \tau \Big\| X_d - M \circ \sum_{s=1}^{S} X_s \Big\|_2^2. \tag{7} $$
Notice that the inclusion of the binary mask M in the above objective function prevents the sparse model we try to build from fitting the invalid (missing) data. A solution to the optimization problem (7) can be obtained by employing the same iterative thresholding strategy as in the MCA algorithm. In particular, the "block-coordinate relaxation" method [41] is used. It is a fast numerical technique, which requires only matrix-vector multiplications thanks to the linearity assumed for the sparsifying transformations.
Regarding the choice of the regularization parameter τ, which is also employed by the soft-thresholding operator used in the solution of (7), one solution is based on the following observation. At the early stages of the algorithm's execution, the estimation of the individual morphological components may be inaccurate because of the missing data.
To overcome this drawback, one has to start by considering a large value of τ in order to favor the data fidelity term (second term in (7)). The appropriate initialization of τ is done in a rather heuristic way, with its value affecting the speed of convergence. In the present implementation, it is expressed as the minimum of maximal amplitude coefficients of the recorded image in each sparsifying transform domain. Then, the value of τ is decreased monotonically (e.g., according to a linear or an exponential strategy as we employ here) in order to favor the sparsity-enforcing term (first term in (7)). Moreover, τ is updated as a function of the estimated noise standard deviation (e.g., by employing a "Median Absolute Deviation" (MAD) estimator) so as to reject noise. Algorithm 1 (Tab. 1) summarizes the main steps of MCA and gives the expressions for the regularization parameter τ and its updating rule for solving the inpainting problem expressed by (7).
Concerning the convergence of this algorithm, MCA for image inpainting is based on an iterative thresholding process, as is the original MCA algorithm, which has been proven to converge [26,42]. For inpainting, however, the presence of the mask means that this result no longer carries over directly. In [32], a sketch of a proof for the convergence of MCA in case of inpainting, along with the role and the effect of the mask, is provided. However, we emphasize that convergence here means that the sequence of iterates for the recovery of the individual morphological components converges, but there is no guarantee on the properties of the minimizer with respect to the true signal. In fact, establishing recovery guarantees for MCA-based inpainting is a very important theoretical problem that remains largely open.

EXPERIMENTAL EVALUATION
In this section, we evaluate the efficiency of the MCA framework for the recovery of missing information in high-resolution GIXD images, and subsequently, for the analysis of 1-D intensity patterns (linecuts), which are of interest for characterizing the structure of thin films. More specifically, our test set consists of two 2 048 × 2 048 GIXD images, X1 and X2, shown in Figure 2, whose acquisition details were described in Section 1.
As already stated, the analysis of GIXD data is based on the reduction of the 2-D images into 1-D intensity patterns. From the several linecuts mentioned in Section 1, herein we focus on three distinct types of linecuts (Fig. 1b):
• Intensity as a function of the scattering vector q (I versus q);
• Intensity as a function of the out-of-plane component of the scattering vector (I versus $q_z$);
• Intensity as a function of the polar angle (I versus χ).
It is noted that in the subsequent experimental evaluation, we ignore the part at the bottom of the GIXD images that corresponds to the shadow of the film substrate, since the intensity recorded in this area is not diffracted from the film under study.

Algorithm 1 (Tab. 1): MCA for image inpainting
Parameters: $\lambda_{\max} = \min_{s=1,\ldots,S} \max |T_s\{X_d\}|$; $\lambda_{stop}$, a constant depending on the noise level (set to a small value, e.g., $10^{-6}$, in the noiseless case and between 3 and 5 for noisy images); $\sigma_n$, the estimated noise standard deviation (e.g., using MAD).
while $\tau^{(t)} > \tau_{\min}$ do
    Execute the following iteration to estimate each component $X_s$ at iteration t by assuming all the others $\{X_k\}_{k=1,\, k \neq s}^{S}$ are fixed:
    for s = 1, . . . , S do
        • update the residual;
        • estimate the current transform coefficients $c_s^{(t)}$ and apply soft-thresholding with threshold $\tau^{(t)}$;
        • update the morphological component from the selected transform coefficients;
    • update the threshold using an exponential decay rule.
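The iteration summarized in Algorithm 1 can be illustrated with a small, self-contained Python sketch. It is a toy version under several assumptions and not the implementation used in the experiments: the signal is 1-D, the two dictionaries are an orthonormal DCT and the identity (spike) basis instead of curvelets and the UDWT, the residual is taken as the masked difference between the data and the current sum of components, a fixed number of iterations replaces the while loop, and the final threshold is passed directly as `tau_min`. All function and variable names are hypothetical.

```python
import numpy as np


def soft_threshold(c, tau):
    return np.sign(c) * np.maximum(np.abs(c) - tau, 0.0)


def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k holds the k-th cosine atom."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    D[0, :] = np.sqrt(1.0 / n)
    return D


def mca_inpaint(x_d, mask, transforms, n_iter=100, tau_min=1e-6):
    """Toy MCA inpainting; `transforms` is a list of (forward, inverse) pairs."""
    comps = [np.zeros_like(x_d) for _ in transforms]
    # Initial threshold: minimum over the dictionaries of the largest coefficient.
    tau = min(np.max(np.abs(fwd(x_d))) for fwd, _ in transforms)
    decay = (tau_min / tau) ** (1.0 / (n_iter - 1))
    for _ in range(n_iter):
        for s, (fwd, inv) in enumerate(transforms):
            # Residual on valid pixels only; the other components are held fixed.
            residual = mask * (x_d - sum(comps))
            coeffs = soft_threshold(fwd(comps[s] + residual), tau)
            comps[s] = inv(coeffs)
        tau *= decay  # exponential decrease of the threshold
    return comps


# Illustrative use: one cosine atom plus one spike, 30% of the samples missing.
rng = np.random.default_rng(0)
n = 128
D = dct_matrix(n)
mask = (rng.random(n) > 0.3).astype(float)
spike_pos = int(np.flatnonzero(mask)[10])   # spike placed on an observed sample
x = 8.0 * D[5]
x[spike_pos] += 3.0
x_d = mask * x
transforms = [(lambda v: D @ v, lambda c: D.T @ c), (lambda v: v, lambda c: c)]
smooth, spikes = mca_inpaint(x_d, mask, transforms)
x_hat = smooth + spikes
err_zero = np.linalg.norm(x - x_d)    # error when missing samples are left at zero
err_mca = np.linalg.norm(x - x_hat)   # error after inpainting
```

In this toy setting the cosine part is carried by the DCT component and the spike by the identity component, so the missing samples are filled in by the component whose dictionary extends naturally over the gaps.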
The performance of MCA$^{(2)}$ is evaluated in the case of corrupted GIXD images. In particular, the simulated missing information appears in the form of either randomly burned pixels, or whole burned rows distributed also at random across the whole image. For this purpose, two types of random binary masks are generated, namely:
• Random Mask ($M_{RM}$): matrix whose elements are equal to 1, except for a subset of them being 0, whose positions are distributed uniformly at random;
• Random Lines Mask ($M_{RLM}$): matrix whose elements are equal to 1, except for a subset of rows consisting of all-zero elements, with the indices of these "burned" rows being selected uniformly at random among all the rows of the original image.
Under this assumption, an observed corrupted GIXD image can be expressed as follows, depending on the corruption pattern:
$$ X_{d,RM} = M_{RM} \circ X, \qquad X_{d,RLM} = M_{RLM} \circ X, $$
where $X_{d,RM}$ and $X_{d,RLM}$ are the recorded images with randomly missing pixels and whole rows, respectively. In the subsequent experiments, we generate random masks of the above two types with the number of missing pixels varying in {20%, 30%, 40%, 50%} as a percentage of the total number of pixels. In practice, the case of randomly distributed burned lines appears in 2-D detectors based on wire chambers [40], while the case of randomly burned pixels is met in more modern acquisition systems, such as CCD cameras. Figure 3 shows corrupted instances for both images, X1 and X2, where the top row corresponds to a random mask with 50% of missing information, while the bottom row corresponds to a random lines mask, also with 50% of burned pixels. In our case, a successful inpainting of the observed GIXD images using MCA is equivalent to recovering the missing information accurately enough that the associated linecuts, estimated from the reconstructed images, are close approximations of the linecuts corresponding to the original images. Although 40% or 50% of missing pixels is considered an extreme scenario in practice, we validate the efficiency of MCA even under such severe conditions.

(2) Matlab code available at: http://www.greyc.ensicaen.fr/~jfadili/demos/WaveRestore/downloads/mcalab/Home.html
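The two mask types and the resulting corrupted images can be generated along the following lines. This is a minimal sketch with hypothetical function names; the small 64 × 64 test image stands in for a GIXD image and the seed is arbitrary.

```python
import numpy as np


def random_mask(shape, missing_fraction, rng):
    """Binary mask with individually burned pixels at uniformly random positions."""
    n_total = shape[0] * shape[1]
    n_missing = int(round(missing_fraction * n_total))
    mask = np.ones(n_total)
    mask[rng.choice(n_total, size=n_missing, replace=False)] = 0.0
    return mask.reshape(shape)


def random_lines_mask(shape, missing_fraction, rng):
    """Binary mask with whole burned rows chosen uniformly at random."""
    n_missing = int(round(missing_fraction * shape[0]))
    mask = np.ones(shape)
    mask[rng.choice(shape[0], size=n_missing, replace=False), :] = 0.0
    return mask


# Illustrative use: 50% missing information on a synthetic image.
rng = np.random.default_rng(1)
image = rng.random((64, 64)) + 1.0          # strictly positive stand-in for X
m_rm = random_mask(image.shape, 0.5, rng)
m_rlm = random_lines_mask(image.shape, 0.5, rng)
x_d_rm = m_rm * image                        # element-by-element product
x_d_rlm = m_rlm * image
```

Sampling without replacement guarantees that the requested percentage of pixels (or rows) is burned exactly, rather than only on average.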

Morphological Components using UDWT and Curvelets
As mentioned in Section 2.2, morphological diversity relies on the sparsity of the distinct morphological components in specific bases. The appropriate choice of these bases or, more generally, of the sparsifying transform domains is largely determined by the structural content of a given image. For instance, the Discrete Cosine Transform (DCT) is appropriate for describing spatially homogeneous texture, while the local ridgelet transform [38] is efficient in describing lines of fixed size. On the other hand, for images containing isotropic features and piecewise smooth regions, the UDWT has been shown to provide a very precise description [15], while anisotropic curvilinear structures have been shown to be represented optimally sparsely by the curvelet transform [16]. A visual inspection of the GIXD images used in this study, shown in Figure 2, suggests that none of the above transforms alone is capable of extracting the geometric structures of these images. Instead, the recorded images are characterized by both piecewise smooth regions and curvilinear structures, which necessitates a combination of transforms to represent GIXD images of this type of thin film in a sparse way. This first observation makes the combination of the UDWT and the curvelet transform a suitable choice. Moreover, we emphasize that the success of the MCA technique depends on the degree of incoherence between the distinct sparsifying transforms [26] (the s-th morphological component $X_s$ is sparsely represented in $T_s$, while its representation in the other transform domains is not sparse). The incoherence requirement holds for the {UDWT, curvelet transform} pair, as proved in [44] (see Lemma 3.3 and the discussion in Section 3.5). This serves as a second strong motivation for selecting this pair of transforms.

Figure 3: High-resolution 2 048 × 2 048 GIXD images with 50% of missing pixels, masked with a random mask with uniformly distributed burned a,b) pixels; c,d) lines.

Unlike the DWT, which decomposes a given image at multiple scales by downsampling the approximation and detail coefficients at each decomposition level, the UDWT does not include the downsampling operations. Thus, the approximation and detail coefficients at each level have the same dimensions as the original image. In addition, unlike the DWT, the UDWT is shift-invariant and more robust to ringing artifacts around singularities or edges.
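The contrast between the decimated and the undecimated transform can be illustrated on a 1-D toy signal. The sketch below is a simplification under stated assumptions: it uses a single-level Haar filter with periodic extension (not the sym6 wavelet or the 2-D transform used in the study), and all function names are hypothetical.

```python
import numpy as np

def haar_undecimated(x):
    """One level of an undecimated Haar transform (periodic extension)."""
    nxt = np.roll(x, -1)
    approx = (x + nxt) / 2.0
    detail = (x - nxt) / 2.0
    return approx, detail

def haar_decimated(x):
    """Same filters, followed by downsampling by two (DWT-like)."""
    approx, detail = haar_undecimated(x)
    return approx[::2], detail[::2]

x = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
x_shifted = np.roll(x, 1)

# Undecimated: coefficients keep the signal length and simply shift with it
a_u, d_u = haar_undecimated(x)
a_us, d_us = haar_undecimated(x_shifted)
shift_equivariant = (np.allclose(np.roll(a_u, 1), a_us)
                     and np.allclose(np.roll(d_u, 1), d_us))

# Decimated: the detail energy depends on where the pulse falls
_, d_d = haar_decimated(x)
_, d_ds = haar_decimated(x_shifted)
energy_original = np.sum(d_d ** 2)    # 0.0: pulse aligned with the sample pairs
energy_shifted = np.sum(d_ds ** 2)    # 0.5: the same pulse, shifted by one sample
```

A one-sample shift changes the decimated detail coefficients from all zeros to non-zero values, whereas the undecimated coefficients are only translated.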
On the other hand, the curvelet transform, to be used for the extraction of the anisotropic structures (curves, lines), is a special member of the family of multiscale geometric transforms. Conceptually, the curvelet transform is a multiscale pyramid with many directions (angles) and locations at each length scale, and needle-shaped elements (atoms) at fine scales. More specifically, curvelets, in addition to a variable width ($w$), have also a variable length ($l$) and so a variable anisotropy. The length and width at fine scales are related by a scaling law, $w = l^2$, and thus the anisotropy increases with decreasing scale like a power law.
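As a short worked instance of this scaling law, assume the usual dyadic parametrization in which a curvelet at scale index $j$ has length $l_j \approx 2^{-j/2}$ (the index $j$ and this parametrization are assumptions introduced here for illustration):

```latex
l_j \approx 2^{-j/2}, \qquad
w_j = l_j^{2} \approx 2^{-j}, \qquad
\frac{l_j}{w_j} = \frac{1}{l_j} \approx 2^{j/2}.
```

Under this assumption, moving two scales finer ($j \to j+2$) halves the length, divides the width by four, and doubles the aspect ratio, which is the power-law growth of anisotropy mentioned above.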
It has been shown [6,16] that curvelets address efficiently problems where wavelets are far from ideal. For instance, they provide optimally sparse representations of objects which are characterized by smoothness except for discontinuities along a general curve with bounded curvature. Moreover, they model faithfully the geometry of wave propagation-like structures, since they may be viewed as coherent waveforms with enough frequency localization so that they behave like waves but at the same time, with enough spatial localization so that they simultaneously behave like particles. GIXD images of the kind used in this study (Fig. 2) present both characteristics, that is, curvilinear structures, as well as a wave propagation-like behavior, thus motivating the use of the curvelet transform for the analysis of the corresponding morphological component.
In the following, we test the effectiveness of MCA in decomposing the given GIXD images into their morphological components. More specifically, the parameters required by Algorithm 1 are set as follows: $I_{\max} = 150$, $\lambda_{\mathrm{stop}} = 10^{-6}$, and $\tau^{(0)} = \lambda_{\max}$, where $\lambda_{\max}$ is the minimum, over the sparsifying transform domains, of the maximal coefficient amplitude of the recorded image in each domain. Regarding the UDWT, each image is analyzed in 3 scales using the "Symlet 6" (sym6) wavelet, which is near symmetric, orthogonal and biorthogonal. Moreover, the spread and the oscillating nature of the associated scaling and wavelet functions, respectively, are well suited to analyzing images with relatively piecewise smooth content, such as the GIXD images under study. Regarding the curvelet transform, each GIXD image is decomposed in 7 scales, where the number of angles at each scale, from the coarsest to the finest, is equal to 1, 16, 32, 32, 64, 64, 128, respectively. For the above experimental setup and the specific implementations used here, the redundancy factor (i.e., the ratio of the number of transform coefficients to the number of pixels) is equal to 3 for the UDWT and 2.3 for the curvelet transform. Figure 4 shows the morphological decomposition of X1 and X2 into the two distinct components. A visual inspection of both images verifies that the UDWT is indeed capable of representing the isotropic, piecewise smooth regions, whereas the curvelets approximate anisotropic features, such as the arcs and the edges, which appear in both images.
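The initial threshold defined above can be computed directly from the transform coefficients. The sketch below also builds a threshold sequence for the 150 iterations; Algorithm 1 is not reproduced in this section, so the linear decay from $\lambda_{\max}$ to $\lambda_{\mathrm{stop}}$ is an assumption, and the small coefficient arrays are hypothetical stand-ins for the actual UDWT and curvelet coefficients.

```python
import numpy as np

def initial_threshold(coefficient_sets):
    """Smallest, over the transforms, of the largest coefficient magnitude."""
    return min(float(np.max(np.abs(c))) for c in coefficient_sets)

def linear_threshold_schedule(lam_max, lam_stop, n_iter):
    """Thresholds decreasing linearly from lam_max to lam_stop (assumed decay rule)."""
    return np.linspace(lam_max, lam_stop, n_iter)

# Hypothetical coefficients of the recorded image in the two transform domains
udwt_coeffs = np.array([0.2, -3.0, 1.5])
curvelet_coeffs = np.array([-0.7, 2.0, 0.1])

lam_max = initial_threshold([udwt_coeffs, curvelet_coeffs])   # min(3.0, 2.0) = 2.0
thresholds = linear_threshold_schedule(lam_max, 1e-6, 150)
```

Taking the minimum of the per-transform maxima means that at least one coefficient in every transform domain reaches the initial threshold.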
As mentioned above, apart from the mutual incoherence of the (possibly over-complete, redundant) transforms, the second key ingredient required for MCA to work properly is their ability to represent the information content of a given image in a precise and compact form. This is indeed the case for the UDWT and the curvelet transform when they are applied to the given GIXD images. More precisely, for the image X1, 50% of the UDWT coefficients contain 82% of the total energy (the sum of the absolute values of the transform coefficients), while 96% of the total energy is concentrated in the 25% most significant (highest-magnitude) curvelet coefficients. A similar behavior is observed for the image X2, where 50% of the UDWT coefficients contain 81% of the total energy, while 25% of the curvelet coefficients account for 94% of their total energy.
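The compaction figures quoted above follow the definition of energy given in the text (sum of absolute coefficient values). A minimal sketch of such a measurement is shown below; the function name and the four-element coefficient vector are illustrative assumptions, not data from the study.

```python
import numpy as np

def energy_fraction(coeffs, keep_fraction):
    """Share of the total energy (sum of |coefficients|) carried by the
    largest-magnitude keep_fraction of the coefficients."""
    magnitudes = np.sort(np.abs(np.ravel(coeffs)))[::-1]   # descending order
    n_keep = int(np.ceil(keep_fraction * magnitudes.size))
    return magnitudes[:n_keep].sum() / magnitudes.sum()

coeffs = np.array([4.0, -3.0, 2.0, -1.0])
share = energy_fraction(coeffs, 0.5)   # (4 + 3) / 10 = 0.7
```

Applied to the full coefficient arrays of a transform, the same function gives the percentage of energy retained by the most significant coefficients.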

Approximation Accuracy of the Original Linecuts
The primary objective of this study is the MCA inpainting of corrupted GIXD images, with the goal of approximating as accurately as possible the original diffracted intensity, expressed in terms of the three linecuts introduced in Section 1.
Although MCA is designed in a sparsity-based framework, solving a sparse optimization problem in suitable transform domains, for completeness of presentation its performance is compared with two classical inpainting techniques, namely, the method of Fields of Experts (FoE) [45] and a Partial Differential Equations-based (PDE) approach [46]. In the former case, expressive image priors that capture the statistics of natural scenes are learned, extending traditional Markov Random Field (MRF) models by learning potential functions over extended pixel neighborhoods. In the latter case, the lost information is restored guided by the anisotropic diffusion principle and the connectivity principle of human visual perception. Specifically, a fourth-order PDE model allows for the transportation of available information from the exterior towards the interior of the inpainting domain and the simultaneous diffusion of the information inside the inpainting region.
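To give a feel for diffusion-driven inpainting, the sketch below fills missing pixels by repeated averaging of their four neighbours. It is deliberately much simpler than the fourth-order model of [46]: it is a second-order (harmonic) diffusion, swapped in here purely for illustration, and the function name, iteration count and test image are assumptions.

```python
import numpy as np

def harmonic_inpaint(image, mask, n_iter=500):
    """Fill pixels where mask == 0 by iterated 4-neighbour averaging,
    keeping the known pixels (mask == 1) fixed."""
    known = mask.astype(bool)
    # Start the unknown pixels from the mean of the known ones
    result = np.where(known, image, image[known].mean())
    for _ in range(n_iter):
        padded = np.pad(result, 1, mode="edge")
        neighbour_mean = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                          + padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        result = np.where(known, image, neighbour_mean)
    return result

rng = np.random.default_rng(1)
image = np.full((16, 16), 3.0)                     # hypothetical flat image
mask = (rng.random(image.shape) > 0.5).astype(np.uint8)
restored = harmonic_inpaint(image * mask, mask)
```

Every filled value is an average of its neighbours, so the result inside the missing regions is smooth by construction; fine, noise-like detail cannot be regenerated this way.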
Starting with the efficiency of MCA in inpainting GIXD images corrupted by a random mask, Figure 5 shows the inpainted X1 and X2 images corresponding to the two extreme cases of our experimental setup, that is, for 20% and 50% of randomly burned pixels. As can be seen, the performance of MCA is excellent, even for 50% of missing information, which is a rather extreme practical scenario. Comparing the inpainted images with their original versions, shown in Figure 2, we observe that the smooth regions in X1 are preserved, while the noise-like appearance of X2 is suppressed slightly in its two inpainted counterparts. This is mainly due to an internal soft-thresholding step used by MCA, which tends to moderate the noise-like features.

Figure 6 presents the corresponding linecuts (I versus q, I versus $q_z$, I versus χ) for X1 and X2, comparing the original curves with the curves obtained from the inpainted images. In the case of X1, the approximation of all three original linecuts is almost perfect, even for high percentages of corrupted pixels. The reconstruction performance for X2 is also very high, except for the extreme case in which half of the information is missing. In that case, small deviations from the original curves are apparent (for instance, in the rightmost part of the I versus χ linecut in Fig. 6b). These observations, initially derived by visual inspection of the curves in Figure 6, are verified numerically by computing the Mean Squared Relative approximation Error (MSRE) between the original linecut, $l_{\mathrm{true}}$, and the corresponding reconstructed linecut, $\hat{l}$. Table 2 shows the MSRE (%) for the two GIXD images, along with the associated standard deviation of the approximation errors, for the MCA, FoE and PDE methods. The approximation accuracy is extremely high, especially for the smoother image X1, while, as expected, it decreases slightly as the number of missing pixels increases.
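The exact MSRE formula is not reproduced in this text, so the sketch below shows only one plausible reading of the name: the mean of the squared point-wise relative errors, reported in percent together with their standard deviation. The function name and the three-point linecuts are assumptions, and the original linecut is assumed to be non-zero everywhere.

```python
import numpy as np

def msre_percent(l_true, l_hat):
    """Mean and standard deviation (in %) of the squared relative errors
    between an original linecut and its reconstruction."""
    l_true = np.asarray(l_true, dtype=float)
    l_hat = np.asarray(l_hat, dtype=float)
    squared_relative_error = ((l_true - l_hat) / l_true) ** 2
    return (100.0 * squared_relative_error.mean(),
            100.0 * squared_relative_error.std())

# Hypothetical linecut samples with relative errors of -10%, 0% and +10%
l_true = np.array([10.0, 20.0, 40.0])
l_hat = np.array([11.0, 20.0, 36.0])
msre, spread = msre_percent(l_true, l_hat)
```

Normalizing each error by the original intensity keeps low-intensity parts of a linecut from being dominated by the peak.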
Moreover, MCA clearly outperforms the other two inpainting methods in most of the cases (the minimum MSRE values are shown in bold type). In particular, the PDE approach yields the worst performance for GIXD images of the type studied here. This is not surprising, since the diffusion operation, which is inherent in every PDE-based method, tends to smooth out the interiors of the inpainted regions. This smoothing effect may degrade the noise-like structures of our GIXD images, thus yielding less accurate linecuts. An additional advantage of MCA over FoE and PDE is that, apart from restoring the lost information of a given image, it also provides the decomposition into its morphological components as a by-product. As mentioned above, this decomposition can be further employed to extract structural information from GIXD data. For instance, a difference in the curvilinear structures corresponding to the two different annealing temperatures is apparent from a simple visual inspection of the curvelet components of X1 and X2 (Fig. 4).
Figure 5: Inpainted (random mask) images a) X1, b) X2 using MCA for randomly distributed missing pixels (20%, 50%).

As a second experiment, we evaluate the efficiency of MCA when whole lines of burned pixels are distributed randomly in the recorded GIXD image. Figure 7 shows the inpainted X1 and X2 images for 20% and 50% of burned pixels across randomly distributed lines. As in the previous scenario of uniformly random burned pixels, the performance of MCA is again very high, even for 50% of missing information. Comparing the inpainted images with their original counterparts in Figure 2, we can see that the noise-like appearance of X2 is suppressed in the two inpainted images. The smoothing effect of MCA for the specific combination of sparsifying transforms employed here (UDWT, curvelets) is more prevalent in the darker area at the bottom of the inpainted images (Fig. 7b). However, this is not a limitation since, as mentioned at the beginning of this section, the dark area is ignored during the computation of the linecuts. Figure 8 shows the corresponding linecuts for X1 and X2, comparing the original curves with the curves obtained from the inpainted images. Overall, the approximation is highly accurate, especially for the first two linecuts (I versus q and I versus $q_z$), while the accuracy diminishes slightly as the percentage of missing pixels increases. We observe that the I versus χ linecut is the most sensitive to the amount of missing information, which is most apparent near the peak and in the tails of the intensity curves for both images. In fact, the peak corresponds to the integrated intensity of the pixels around column 1 680 between rows 630-670, while the tail corresponds to the integrated intensity around row 440 between columns 1 440-1 460 (Fig. 2). A close inspection of the masked images shown in Figures 3c,d reveals that a large amount of missing pixels, effectively greater than 50%, is concentrated in the corresponding areas. This inhibits the inpainting of the images in these specific areas, resulting in an increased approximation error.
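The observation that a mask with 50% of burned pixels overall can remove far more than 50% inside a small region is easy to check numerically. The sketch below uses a hypothetical 8 × 8 lines mask and an arbitrary window; on the real masks, the window would instead be chosen from the pixel ranges quoted above (for example, rows 630-670 around column 1 680).

```python
import numpy as np

def local_missing_fraction(mask, rows, cols):
    """Fraction of burned (zero) pixels inside the window mask[rows, cols]."""
    window = mask[rows, cols]
    return 1.0 - window.mean()

# Hypothetical lines mask: 4 of the 8 rows are burned, i.e. 50% overall
mask = np.ones((8, 8), dtype=np.uint8)
mask[[1, 2, 3, 6], :] = 0

global_missing = 1.0 - mask.mean()                                       # 0.5
local_missing = local_missing_fraction(mask, slice(0, 4), slice(2, 6))   # 0.75
```

Because burned rows can cluster, a narrow band of rows may lose most of its pixels even when the global percentage is moderate, which leaves few nearby samples in the vertical direction to guide the reconstruction.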
The above observations are consistent with the entries of Table 3, which reports the MSRE (%) values for the two GIXD images, along with the associated standard deviation of the approximation errors, for the MCA, FoE and PDE inpainting methods. As in the case of randomly burned pixels, the approximation accuracy is extremely high, especially for the smoother image X1 and for the first two linecuts, while, as expected, it decreases slightly as the number of missing pixels increases. Moreover, MCA achieves a more accurate reconstruction of the linecuts than FoE and PDE in most of the cases, and its performance remains very close to theirs in the cases where they yield a lower MSRE. Finally, by comparing the corresponding entries in Tables 2 and 3, we conclude that inpainting GIXD images with randomly missing lines is more demanding than inpainting images with randomly distributed missing pixels. Focusing on MCA, as we suggest in the next section, this reduction in reconstruction accuracy can be alleviated by incorporating additional sparsifying transforms, such as the ridgelet transform, which is more efficient in extracting meaningful information from the neighboring pixels across straight lines and edges. This is another important advantage of MCA, namely, the ability to improve the inpainting accuracy by incorporating more efficient sparsifying transforms, which are able to extract additional structural components. In contrast, such an improvement is not possible with methods like FoE or PDE.

Figure 6: Linecuts of the original and inpainted (using MCA) GIXD images a) X1, b) X2 for randomly distributed missing pixels.

Figure 7: Inpainted (random lines mask) images a) X1, b) X2 using MCA for randomly distributed lines of missing pixels (20%, 50%).

Figure 8: Linecuts of the original and inpainted (using MCA) GIXD images a) X1, b) X2 for randomly distributed lines of missing pixels.

TABLE 3
Mean Squared Relative approximation Error (MSRE) (%) between the original and reconstructed linecuts, using MCA, FoE and PDE inpainting methods, for images X1 a), X2 b) with missing pixels across randomly distributed lines (the standard deviation of the error is shown in parentheses)

CONCLUSIONS AND FUTURE WORK
In this study, we exploited the efficiency of the recently introduced MCA algorithm, which decomposes images into distinct morphological components based on the sparsity achieved in appropriate over-complete, redundant transform domains. Our objective was to recover the missing information in GIXD images corrupted by a potential malfunction of the detectors. The experimental evaluation using high-resolution GIXD images of thin polythiophene-based films showed that the proposed approach is highly efficient in recovering the missing information in the form of either randomly burned pixels or whole burned rows, even at the order of 50% of the total number of pixels. This led to the derivation of accurate 1-D intensity plots (linecuts) from the inpainted images, which will later allow for correct data interpretation. This result can be of high impact in scattering and imaging techniques applied to materials characterization, since it indicates that the proposed MCA-based inpainting approach reduces the need for long acquisition times or repeated measurements caused by inferior detector performance during acquisition, especially in synchrotron-based experiments.
In the present work, we employed the UDWT and the curvelet transform as the most appropriate domains for the representation of the isotropic and the anisotropic features, respectively, which appear in GIXD data. However, the experimental evaluation showed a smoothing effect when a GIXD image presents noise-like features, which may decrease the estimation accuracy of the corresponding linecuts. We expect that the inclusion of additional transforms to capture more features, such as the noiselet, the ridgelet or the recently introduced shearlet transform [47], which best suit noise-like features and edges, respectively, will improve the inpainting performance and, subsequently, the accuracy of the extracted linecuts of interest. Moreover, in all these cases, the sparsifying transforms are fixed. However, recent works have shown that, instead of deploying a predefined set of transforms, the use of learned sparsifying dictionaries [7,8], which adapt to the inherent image structures, often results in a superior reconstruction performance. Thus, the extension of the MCA inpainting framework to a joint dictionary learning and reconstruction framework would also be of great importance.