## 62-XX STATISTICS


- An Iterative Plug-in Algorithm for Optimal Bandwidth Selection in Kernel Intensity Estimation for Spatial Data (2018)
- A popular model for the locations of fibres or grains in composite materials is the inhomogeneous Poisson process in dimension 3. Its local intensity function may be estimated non-parametrically by local smoothing, e.g. by kernel estimates, which crucially depend on the choice of bandwidths as tuning parameters controlling the smoothness of the resulting function estimate. In this thesis, we propose a fast algorithm for learning suitable global and local bandwidths from the data. It is well known that intensity estimation is closely related to probability density estimation. As a by-product of our study, we show that the difference is asymptotically negligible regarding the choice of good bandwidths, and hence we focus on density estimation. There is quite a number of data-driven bandwidth selection methods for kernel density estimates. Cross-validation is a popular one, frequently proposed for estimating the optimal bandwidth. However, if the sample size is very large, it becomes computationally expensive; in materials science, in particular, it is very common to have several thousand up to several million points. Another type of bandwidth selection is the solve-the-equation plug-in approach, which replaces the unknown quantities in the asymptotically optimal bandwidth formula by their estimates. In this thesis, we develop such an iterative fast plug-in algorithm for estimating the optimal global and local bandwidths for density and intensity estimation, with a focus on 2- and 3-dimensional data. It is based on a detailed asymptotic analysis of the estimators of the intensity function and of its second derivatives and integrals of second derivatives, which appear in the formulae for the asymptotically optimal bandwidths. These asymptotics are used to determine the exact number of iteration steps and some tuning parameters; for both the global and the local case, fewer than 10 iterations suffice. Simulation studies show that intensity estimates with local bandwidths capture the variation of the local intensity better than those with a global bandwidth. Finally, the algorithm is applied to two real data sets from test bodies of fibre-reinforced high-performance concrete, clearly showing some inhomogeneity of the fibre intensity.
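The iterative plug-in algorithm itself is not reproduced here; as a minimal sketch of the bandwidth dependence it addresses, the following computes a one-dimensional Gaussian kernel density estimate, with Silverman's rule-of-thumb bandwidth standing in for the plug-in choice (all function names are illustrative, not from the thesis):

```python
import numpy as np

def silverman_bandwidth(x):
    # Rule-of-thumb global bandwidth, a common plug-in starting value
    return 1.06 * np.std(x) * len(x) ** (-1 / 5)

def kde(x, grid, h):
    # Gaussian kernel density estimate evaluated on a grid
    u = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(x) * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=2000)
h = silverman_bandwidth(x)
grid = np.linspace(-4.0, 4.0, 81)
f = kde(x, grid, h)
```

A plug-in method would refine `h` iteratively using estimates of the second derivative of the density; the rule of thumb above only supplies a reasonable starting value.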

- On Changepoint Detection in a Series of Stimulus-Response Data (2018)
- In this paper, we demonstrate the power of functional data models for the statistical analysis of stimulus-response experiments, which is a natural way to look at this kind of data and makes use of the full information available. In particular, we focus on the detection of a change in the mean of the response in a series of stimulus-response curves, where we also take dependence in time into account.
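A scalar analogue of the mean-change detection described above can be sketched with a classical CUSUM statistic; this is a deliberate simplification (the paper works with functional, time-dependent curves), and all names are illustrative:

```python
import numpy as np

def cusum_statistic(x):
    # Maximum of the absolute standardized CUSUM process for a change in the mean
    n = len(x)
    s = np.cumsum(x - x.mean())
    stat = np.max(np.abs(s)) / (np.std(x) * np.sqrt(n))
    khat = int(np.argmax(np.abs(s))) + 1  # estimated change point: argmax of |S_k|
    return stat, khat

rng = np.random.default_rng(1)
# Mean shifts from 0 to 1.5 halfway through the series
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(1.5, 1.0, 100)])
stat, khat = cusum_statistic(x)
```

Large values of `stat` indicate a change; under the no-change null hypothesis the statistic converges to the supremum of a Brownian bridge.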

- Nonparametric Tests for Change Points in Hazard Functions under Random Censorship in Survival Analysis (2017)
- The thesis studies change points in absolute time for censored survival data, with some contributions to the more common analysis of change points with respect to survival time. We first introduce the notions and estimates of survival analysis, in particular the hazard function and censoring mechanisms. Then we discuss change-point models for survival data. In the literature, usually change points with respect to survival time are studied; typical examples are piecewise constant and piecewise linear hazard functions. For that kind of model, we propose a new algorithm for the numerical calculation of maximum likelihood estimates based on a cross-entropy approach, which in our simulations outperforms the common Nelder-Mead algorithm. Our original motivation was the study of censored survival data (e.g., after diagnosis of breast cancer) over several decades. We wanted to investigate whether the hazard functions differ between various time periods due, e.g., to progress in cancer treatment. This is a change-point problem in the spirit of classical change-point analysis. Horváth (1998) proposed a suitable change-point test based on estimates of the cumulative hazard function. As an alternative, we propose similar tests based on nonparametric estimates of the hazard function. For one class of tests, related to kernel probability density estimates, we fully develop the asymptotic theory of the change-point tests. For the other class of estimates, which are versions of the Watson-Leadbetter estimate with censoring taken into account and which are related to the Nelson-Aalen estimate, we discuss some steps towards the full asymptotic theory. We close by applying the change-point tests to simulated and real data, in particular to the breast cancer survival data from the SEER study.
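The Nelson-Aalen estimate of the cumulative hazard mentioned above can be sketched for right-censored data as follows; this minimal version assumes continuous, tie-free observation times, and the names are illustrative:

```python
import numpy as np

def nelson_aalen(times, events):
    # Nelson-Aalen estimate of the cumulative hazard under right censoring:
    # at each observed event time, the hazard jumps by 1 / (number at risk)
    order = np.argsort(times)
    t, d = times[order], events[order]
    n = len(t)
    at_risk = n - np.arange(n)   # subjects still at risk just before each time
    increments = d / at_risk     # censored observations (d = 0) contribute no jump
    return t, np.cumsum(increments)

rng = np.random.default_rng(2)
true_t = rng.exponential(1.0, 500)            # survival times, hazard rate 1
cens = rng.exponential(2.0, 500)              # independent censoring times
times = np.minimum(true_t, cens)
events = (true_t <= cens).astype(float)       # 1 = event observed, 0 = censored
t, H = nelson_aalen(times, events)
```

For a constant hazard rate of 1, the estimated cumulative hazard should be close to \(H(t) = t\), e.g. about 1 at time 1.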

- Asymptotics for change-point tests and change-point estimators (2017)
- In change-point analysis the question of interest is whether the observations follow one model or whether there is at least one time point at which the model has changed. This results in two subfields, testing for a change and estimating the time of change. This thesis considers both parts, restricted to testing and estimating at most one change point. A well-known example is based on independent observations having one change in the mean. Based on the likelihood ratio test, a test statistic with an asymptotic Gumbel distribution was derived for this model. As the corresponding convergence rate is known to be very slow, modifications of the test using a weight function were considered; those tests perform better, and we focus on this class of test statistics. The first part gives a detailed introduction to the techniques for analysing test statistics and estimators. To this end, we consider the multivariate mean-change model and focus on the effects of the weight function. In the case of change-point estimators, we can distinguish between the assumption of a fixed size of change (fixed alternative) and the assumption that the size of the change converges to 0 (local alternative). The fixed case, in particular, is rarely analysed in the literature. We show how to pass from the proof for the fixed alternative to the proof for the local alternative. Finally, we give a simulation study for heavy-tailed multivariate observations. The main part of this thesis focuses on two points: first, analysing test statistics and, secondly, analysing the corresponding change-point estimators. In both cases, we first consider a change in the mean for independent observations, but relax the moment condition. Based on a robust estimator for the mean, we derive a new type of change-point test with a randomised weight function. Secondly, we analyse non-linear autoregressive models with unknown regression function. Based on neural networks, test statistics and estimators are derived for correctly specified as well as for misspecified situations. This part extends the literature, as we analyse test statistics and estimators based not only on the sample residuals. In both sections, the one on tests and the one on the change-point estimator, we end by giving regularity conditions on the model as well as on the parameter estimator. Finally, a simulation study for the neural-network-based test and estimator is given. We discuss the behaviour under correct specification and misspecification and apply the neural-network-based test and estimator to two data sets.
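The effect of the weight function discussed above can be illustrated with a weighted CUSUM statistic for a mean change; this is a scalar sketch (not the thesis's robust or neural-network-based versions), and the exponent `gamma` and all names are illustrative:

```python
import numpy as np

def weighted_cusum(x, gamma=0.25):
    # Weighted CUSUM: dividing by (k/n * (1 - k/n))^gamma, gamma in [0, 1/2),
    # increases sensitivity to changes near the boundaries of the sample
    n = len(x)
    k = np.arange(1, n)
    s = np.cumsum(x)
    dev = np.abs(s[:-1] - k / n * s[-1])
    w = (k / n * (1 - k / n)) ** gamma
    stats = dev / (np.std(x) * np.sqrt(n) * w)
    khat = int(k[np.argmax(stats)])  # change-point estimator: argmax of the process
    return stats.max(), khat

rng = np.random.default_rng(3)
# A late change: mean shifts from 0 to 1 after 150 of 200 observations
x = np.concatenate([rng.normal(0.0, 1.0, 150), rng.normal(1.0, 1.0, 50)])
stat, khat = weighted_cusum(x)
```

With `gamma = 0` this reduces to the unweighted CUSUM; larger `gamma` improves power against early or late changes at the price of a different limiting distribution.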

- The Bootstrap for the Functional Autoregressive Model FAR(1) (2016)
- Functional data analysis is a branch of statistics that deals with observations \(X_1,..., X_n\) which are curves. We are interested in particular in time series of dependent curves and, specifically, consider the functional autoregressive process of order one (FAR(1)), which is defined as \(X_{n+1}=\Psi(X_{n})+\epsilon_{n+1}\) with independent innovations \(\epsilon_t\). Estimates \(\hat{\Psi}\) of the autoregressive operator \(\Psi\) have been investigated extensively during the last two decades, and their asymptotic properties are well understood. Particularly difficult, and different from scalar- or vector-valued autoregressions, are the weak convergence properties, which also form the basis of the bootstrap theory. Although the asymptotics for \(\hat{\Psi}(X_{n})\) are still tractable, they are only useful for large enough samples. In applications, however, frequently only small samples of data are available, so that an alternative method for approximating the distribution of \(\hat{\Psi}(X_{n})\) is welcome. As a motivation, we discuss a real-data example in which we investigate a change-point detection problem for a stimulus-response dataset obtained from the animal physiology group at the Technical University of Kaiserslautern. To obtain an alternative to asymptotic approximations, we employ the naive or residual-based bootstrap procedure. In this thesis, we prove theoretically and show via simulations that the bootstrap provides asymptotically valid and practically useful approximations of the distributions of certain functions of the data. Such results may be used to calculate approximate confidence bands or critical bounds for tests.
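The residual-based bootstrap is easiest to see in the scalar AR(1) analogue of FAR(1); the following sketch resamples centred residuals to approximate the distribution of the coefficient estimate (a simplification of the functional setting, names illustrative):

```python
import numpy as np

def fit_ar1(x):
    # Least-squares estimate of psi in X_{t+1} = psi * X_t + eps_{t+1}
    return np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])

def residual_bootstrap(x, b=200, seed=0):
    # Naive residual-based bootstrap for the AR(1) coefficient
    rng = np.random.default_rng(seed)
    psi = fit_ar1(x)
    res = x[1:] - psi * x[:-1]
    res = res - res.mean()                  # centre the fitted residuals
    boot = np.empty(b)
    for i in range(b):
        e = rng.choice(res, size=len(x))    # resample residuals with replacement
        xb = np.empty(len(x))
        xb[0] = x[0]
        for t in range(1, len(x)):
            xb[t] = psi * xb[t - 1] + e[t]  # regenerate a series from the fitted model
        boot[i] = fit_ar1(xb)
    return psi, boot

rng = np.random.default_rng(4)
n = 300
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):
    x[t] = 0.6 * x[t - 1] + rng.normal()    # true psi = 0.6
psi, boot = residual_bootstrap(x)
```

The empirical quantiles of `boot` can then serve as approximate confidence bounds, which is the role the bootstrap plays for the FAR(1) estimator in the thesis.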

- Image based characterization and geometric modeling of 3d materials microstructures (2015)
- It is well known that the structure at the microscopic level strongly influences the macroscopic properties of materials. Moreover, advances in imaging technologies make it possible to capture the complexity of these structures at ever-decreasing scales; therefore, more sophisticated image analysis techniques are needed. This thesis provides tools to geometrically characterize different types of three-dimensional structures, with applications to industrial production and to materials science. Our goal is to enhance methods that allow the extraction of geometric features from images and the automatic processing of this information. In particular, we investigate which characteristics are sufficient and necessary to infer the desired information, such as particle classification for technical cleanliness and the fitting of stochastic models in materials science. In the production lines of the automotive industry, dirt particles collect on the surface of mechanical components. Residual dirt might reduce the performance and durability of assembled products, and the geometric characterization of these particles makes it possible to identify their potential danger. While the current standards are based on 2d microscopic images, we extend the characterization to 3d. In particular, we provide a collection of parameters that exhaustively describe the size and shape of three-dimensional objects and can be efficiently estimated from binary images. Furthermore, we show that only a few features are sufficient to classify particles according to the standards of technical cleanliness. In the context of materials science, we consider two types of microstructures: fiber systems and foams. Stochastic geometry provides the foundation for versatile models able to encompass the geometry observed in the samples. To allow automatic model fitting, we need rules stating which parameters of the model yield the best-fitting characteristics. However, the validity of such rules strongly depends on the properties of the structures and on the choice of the model. For instance, an isotropic orientation distribution yields the best theoretical results for Boolean models and Poisson processes of cylinders with circular cross sections; nevertheless, fiber systems in composites are often anisotropic. Starting from analytical results in the literature, we derive formulae for anisotropic Poisson processes of cylinders with polygonal cross sections that can be used directly in applications. We apply this procedure to a sample of medium-density fiber board. Even though the image resolution does not allow characteristics of the single fibers to be estimated reliably, we can fit Boolean models and Poisson cylinder processes. In particular, we show the complete model fitting and validation procedure for cylinders with circular and square cross sections. Different problems arise when modeling cellular materials. Motivated by the physics of foams, random Laguerre tessellations are a good choice to model the pore system of foams. Considering tessellations generated by systems of non-overlapping spheres allows control of the cell size distribution, but comes at the cost of an analytical description of the model. Nevertheless, automatic model fitting can still be achieved by approximating the characteristics of the tessellation as functions of the model parameters. We investigate how to improve the choice of the model parameters. Angles between facets and between edges had not been considered so far. We show that the distributions of angles in Laguerre tessellations depend on the model parameters; thus, including the moments of the angles still allows automatic model fitting. Moreover, we propose an algorithm to estimate angles from images of real foams. We observe that angles are matched well by random Laguerre tessellations even when they are not employed to choose the model parameters. Then, we concentrate on the edge length distribution: Laguerre tessellations contain many more short edges than real foams. To deal with this problem, we consider relaxed models. Relaxation refers to topological and structural modifications of a tessellation that make it comply with Plateau's laws of mechanical equilibrium. We inspect samples of different types of foams: closed- and open-cell, polymeric and metallic. By comparing the geometric characteristics of the model and of the relaxed tessellations, we conclude that whether relaxation improves the edge length distribution strongly depends on the type of foam.
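Measuring angles between the edges meeting at a vertex, as used above to compare tessellations with Plateau's laws (which predict a tetrahedral angle of about 109.47° between edges in equilibrium), might be sketched as follows; this is a hypothetical helper, not the thesis's image-based algorithm:

```python
import numpy as np

def vertex_angles(v, neighbours):
    # Pairwise angles (in degrees) between the edge directions meeting at vertex v
    dirs = [(p - v) / np.linalg.norm(p - v) for p in neighbours]
    angles = []
    for i in range(len(dirs)):
        for j in range(i + 1, len(dirs)):
            c = np.clip(np.dot(dirs[i], dirs[j]), -1.0, 1.0)  # guard against rounding
            angles.append(np.degrees(np.arccos(c)))
    return angles

# Tetrahedral configuration: the equilibrium geometry of Plateau's laws,
# where all pairwise angles equal arccos(-1/3), about 109.47 degrees
v = np.zeros(3)
neighbours = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float)
angles = vertex_angles(v, neighbours)
```

Applied to vertices extracted from segmented foam images, the empirical distribution of such angles can be compared against the tessellation model.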

- Some Steps towards Experimental Design for Neural Network Regression (2011)
- We discuss some first steps towards experimental design for neural network regression, which at present is too complex to be treated fully in general. We encounter two difficulties: the nonlinearity of the models together with the high parameter dimension on the one hand, and the common misspecification of the models on the other. Regarding the first problem, we restrict our consideration to neural networks with only one or two neurons in the hidden layer and a univariate input variable. We prove some results on locally D-optimal designs and present a numerical study using the concept of maximin optimal designs. Regarding the second problem, we look at the effects of misspecification on optimal experimental designs.
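For a single-neuron model \(f(x) = a \tanh(bx + c)\), the D-criterion behind locally D-optimal designs can be evaluated from the gradient of \(f\) with respect to the parameters; the following sketch (illustrative parameter guess and design points, not from the paper) compares a spread-out design with a clustered one:

```python
import numpy as np

def d_criterion(design, theta):
    # Log-determinant of the information matrix M = sum_x g(x) g(x)^T for
    # f(x) = a * tanh(b*x + c); larger values mean a more informative design
    a, b, c = theta
    M = np.zeros((3, 3))
    for x in design:
        t = np.tanh(b * x + c)
        g = np.array([t, a * (1 - t ** 2) * x, a * (1 - t ** 2)])  # grad wrt (a, b, c)
        M += np.outer(g, g)
    sign, logdet = np.linalg.slogdet(M)
    return logdet if sign > 0 else -np.inf  # singular M carries no usable information

theta = (1.0, 1.0, 0.0)  # illustrative local parameter guess ("locally" optimal design)
spread = d_criterion(np.array([-2.0, 0.5, 2.0]), theta)
clustered = d_criterion(np.array([0.0, 0.1, 0.2]), theta)
```

Maximising such a criterion over candidate designs, for a fixed local parameter guess, is the basic computation behind locally D-optimal design; note that some symmetric three-point designs make the gradients linearly dependent and hence the criterion degenerate.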