A general class of powerful and flexible modeling techniques, spline smoothing has attracted a great deal of research attention in recent years and has been widely used in many application areas, from medicine to economics. Smoothing Splines: Methods and Applications covers basic smoothing spline models, including polynomial, periodic, spherical, thin-plate, L-, and partial splines, as well as more advanced models, such as smoothing spline ANOVA, extended and generalized smoothing spline ANOVA, vector spline, nonparametric nonlinear regression, semiparametric regression, and semiparametric mixed-effects models. It also presents methods for model selection and inference. The book provides unified frameworks for estimation, inference, and software implementation by using the general forms of nonparametric/semiparametric, linear/nonlinear, and fixed/mixed smoothing spline models. The theory of reproducing kernel Hilbert space (RKHS) is used to present various smoothing spline models in a unified fashion. Although this approach can be technical and difficult, the author makes the advanced smoothing spline methodology based on RKHS accessible to practitioners and students. He offers a gentle introduction to RKHS, keeps theory to a minimum, and explains how an RKHS can be used to construct spline models. Smoothing Splines offers a balanced mix of methodology, computation, implementation, software, and applications. It uses R to perform all data analyses and includes a host of real data examples from astronomy, economics, medicine, and meteorology. The code for all examples, along with related developments, can be found on the book's web page.
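The book's own analyses are carried out in R. As a language-neutral illustration of the core idea (a penalized fit whose smoothing parameter trades fidelity against roughness), here is a minimal Python sketch using SciPy's smoothing-spline interface; the simulated data and the choice of the smoothing parameter `s` are invented for illustration and are not taken from the book.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Simulated noisy observations of a smooth curve (illustrative example).
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
truth = np.sin(2 * np.pi * x)
y = truth + rng.normal(scale=0.2, size=x.size)

# s is the target residual sum of squares: larger s gives a smoother fit.
# Here s is set near n * sigma^2, a common rule of thumb for this interface.
spl = UnivariateSpline(x, y, s=len(x) * 0.2**2)
yhat = spl(x)
```

The fitted curve `yhat` should sit much closer to the underlying function than the raw noisy data do; tuning `s` (or, in the book's framework, the smoothing parameter) is the central practical question.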
In recent years, there has been a great deal of interest and activity in the general area of nonparametric smoothing in statistics. This monograph concentrates on the roughness penalty method and shows how this technique provides a unifying approach to a wide range of smoothing problems. The method allows parametric assumptions to be relaxed in regression problems, in problems approached by generalized linear modelling, and in many other contexts. The emphasis throughout is methodological rather than theoretical, and it concentrates on statistical and computational issues. Real data examples are used to illustrate the various methods and to compare them with standard parametric approaches. Some publicly available software is also discussed. The mathematical treatment is self-contained and depends mainly on simple linear algebra and calculus. This monograph will be useful both as a reference work for research and applied statisticians and as a text for graduate students and others encountering the material for the first time.
Data-analytic approaches to regression problems arising from many scientific disciplines are described in this book. The aim of these nonparametric methods is to relax assumptions on the form of a regression function and to let the data search for a suitable function that describes the data well. The use of these nonparametric functions together with parametric techniques can yield very powerful data analysis tools. Local Polynomial Modelling and Its Applications provides an up-to-date picture of state-of-the-art nonparametric regression techniques. The emphasis of the book is on methodologies rather than on theory, with a particular focus on applications of nonparametric techniques to various statistical problems. High-dimensional data-analytic tools are presented, and the book includes a variety of examples. This will be a valuable reference for research and applied statisticians, and will serve as a textbook for graduate students and others interested in nonparametric regression.
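The local polynomial idea the book develops is to fit a low-order polynomial by weighted least squares in a kernel-weighted neighborhood of each evaluation point. As a rough, self-contained sketch (not the book's code), here is a minimal local linear smoother in Python; the Gaussian kernel, bandwidth, and test function are illustrative assumptions.

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear fit at x0 with a Gaussian kernel of bandwidth h.

    Solves a weighted least-squares problem in the centered design
    [1, x - x0]; the fitted intercept is the estimate at x0.
    """
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)       # kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[0]

# Illustrative data: noisy observations of sin(2*pi*x).
rng = np.random.default_rng(3)
x = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)
est = local_linear(0.5, x, y, h=0.05)   # true value at 0.5 is sin(pi) = 0
```

Evaluating `local_linear` over a grid of points traces out the full fitted curve; the bandwidth `h` plays the same smoothing role as the penalty parameter in spline methods.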
Smoothing methods are an active area of research. In this book, the author presents a comprehensive treatment of penalty smoothing under a unified framework. Methods are developed for (i) regression with Gaussian and non-Gaussian responses as well as with censored lifetime data; (ii) density and conditional density estimation under a variety of sampling schemes; and (iii) hazard rate estimation with censored lifetime data and covariates. Extensive discussions are devoted to model construction, smoothing parameter selection, computation, and asymptotic convergence. Most of the computational and data analytical tools discussed in the book are implemented in R, an open-source clone of the popular S/S-PLUS language.
This book presents recent science and engineering research in the field of conventional and renewable energy, energy efficiency, and optimization, discussing problems such as the availability, peak load, and reliability of a sustainable power supply to consumers. Such research is imperative since efficient and environmentally friendly solutions are critical in modern electricity production and transmission.
The first edition of this book has established itself as one of the leading references on generalized additive models (GAMs), and the only book on the topic to be introductory in nature with a wealth of practical examples and software implementation. It is self-contained, providing the necessary background in linear models, linear mixed models, and generalized linear models (GLMs), before presenting a balanced treatment of the theory and applications of GAMs and related models. The author bases his approach on a framework of penalized regression splines, and while firmly focused on the practical aspects of GAMs, discussions include fairly full explanations of the theory underlying the methods. Use of R software helps explain the theory and illustrates the practical application of the methodology. Each chapter contains an extensive set of exercises, with solutions in an appendix or in the book’s R data package gamair, to enable use as a course text or for self-study. Simon N. Wood is a professor of Statistical Science at the University of Bristol, UK, and author of the R package mgcv.
The success of the first edition of Generalized Linear Models led to the updated Second Edition, which continues to provide a definitive, unified treatment of methods for the analysis of diverse types of data. Today, it remains popular for its clarity, richness of content, and direct relevance to agricultural, biological, health, engineering, and other applications. The authors focus on examining the way a response variable depends on a combination of explanatory variables, treatment, and classification variables. They give particular emphasis to the important case where the dependence occurs through some unknown linear combination of the explanatory variables. The Second Edition includes topics added to the core of the first edition, including conditional and marginal likelihood methods, estimating equations, and models for dispersion effects and components of dispersion. The discussion of other topics (log-linear and related models, log odds-ratio regression models, multinomial response models, inverse linear and related models, quasi-likelihood functions, and model checking) was expanded and incorporates significant revisions. Comprehension of the material requires simply a knowledge of matrix theory and the basic ideas of probability theory, but for the most part, the book is self-contained. Therefore, with its worked examples, plentiful exercises, and topics of direct use to researchers in many disciplines, Generalized Linear Models serves as an ideal text, self-study guide, and reference.
This book describes an array of power tools for data analysis that are based on nonparametric regression and smoothing techniques. These methods relax the linear assumption of many standard models and allow analysts to uncover structure in the data that might otherwise have been missed. While McCullagh and Nelder's Generalized Linear Models shows how to extend the usual linear methodology to cover analysis of a range of data types, Generalized Additive Models enhances this methodology even further by incorporating the flexibility of nonparametric regression. Clear prose, exercises in each chapter, and case studies enhance this popular text.
Semialgebraic Statistics and Latent Tree Models explains how to analyze statistical models with hidden (latent) variables. It takes a systematic, geometric approach to studying the semialgebraic structure of latent tree models. The first part of the book gives a general introduction to key concepts in algebraic statistics, focusing on methods that are helpful in the study of models with hidden variables. The author uses tensor geometry as a natural language to deal with multivariate probability distributions, develops new combinatorial tools to study models with hidden data, and describes the semialgebraic structure of statistical models. The second part illustrates important examples of tree models with hidden variables. The book discusses the underlying models and related combinatorial concepts of phylogenetic trees as well as the local and global geometry of latent tree models. It also extends previous results to Gaussian latent tree models. This book shows you how both combinatorics and algebraic geometry enable a better understanding of latent tree models. It contains many results on the geometry of the models, including a detailed analysis of identifiability and the defining polynomial constraints.
This classic work continues to offer a comprehensive treatment of the theory of univariate and tensor-product splines. It will be of interest to researchers and students working in applied analysis, numerical analysis, computer science, and engineering. The material covered provides the reader with the necessary tools for understanding the many applications of splines in such diverse areas as approximation theory, computer-aided geometric design, curve and surface design and fitting, image processing, numerical solution of differential equations, and increasingly in business and the biosciences. This new edition includes a supplement outlining some of the major advances in the theory since 1981, and some 250 new references. It can be used as the main or supplementary text for courses in splines, approximation theory or numerical analysis.
This book is about learning from data using Generalized Additive Models for Location, Scale and Shape (GAMLSS). GAMLSS extends the Generalized Linear Models (GLMs) and Generalized Additive Models (GAMs) to accommodate large complex datasets, which are increasingly prevalent. GAMLSS allows any parametric distribution for the response variable and models all the parameters (location, scale, and shape) of the distribution as linear or smooth functions of explanatory variables. This book provides a broad overview of GAMLSS methodology and how it is implemented in R. It includes a comprehensive collection of real data examples, integrated code, and figures to illustrate the methods, and is supplemented by a website with code, data, and additional materials.
Although there has been a surge of interest in density estimation in recent years, much of the published research has been concerned with purely technical matters, with insufficient emphasis given to the technique's practical value. Furthermore, the subject has been rather inaccessible to the general statistician. The account presented in this book places emphasis on topics of methodological importance, in the hope that this will facilitate broader practical application of density estimation and also encourage research into relevant theoretical work. The book also provides an introduction to the subject for those with general interests in statistics. The important role of density estimation as a graphical technique is reflected by the inclusion of more than 50 graphs and figures throughout the text. Several contexts in which density estimation can be used are discussed, including the exploration and presentation of data, nonparametric discriminant analysis, cluster analysis, simulation and the bootstrap, bump hunting, projection pursuit, and the estimation of hazard rates and other quantities that depend on the density. This book includes a general survey of methods available for density estimation. The kernel method, both for univariate and multivariate data, is discussed in detail, with particular emphasis on ways of deciding how much to smooth and on computational aspects. Attention is also given to adaptive methods, which smooth to a greater degree in the tails of the distribution, and to methods based on the idea of penalized likelihood.
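The kernel method the blurb highlights, together with the central question of how much to smooth, can be illustrated with a short Python sketch using SciPy's kernel density estimator; the simulated bimodal data and the alternative bandwidth are illustrative assumptions, not examples from the book.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Illustrative bimodal sample: a mixture of two normals.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2, 1.0, 300),
                       rng.normal(3, 0.5, 200)])

# Default bandwidth (Scott's rule) versus a deliberately oversmoothed fit;
# the bandwidth choice controls how much detail the estimate preserves.
kde = gaussian_kde(data)
kde_oversmoothed = gaussian_kde(data, bw_method=0.5)

grid = np.linspace(-6, 6, 200)
density = kde(grid)
```

Plotting `density` against `grid` would show the two modes; the oversmoothed variant blurs them together, which is exactly the practical trade-off the book's discussion of bandwidth selection addresses.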
Discover new methods for dealing with high-dimensional data. A sparse statistical model has only a small number of nonzero parameters or weights and is therefore much easier to estimate and interpret than a dense model. Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underlying signal in a set of data. The authors, top experts in this rapidly evolving field, describe the lasso for linear regression and a simple coordinate descent algorithm for its computation. They discuss the application of l1 penalties to generalized linear models and support vector machines, cover generalized penalties such as the elastic net and group lasso, and review numerical methods for optimization. They also present statistical inference methods for fitted (lasso) models, including the bootstrap, Bayesian methods, and recently developed approaches. In addition, the book examines matrix decomposition, sparse multivariate analysis, graphical models, and compressed sensing. It concludes with a survey of theoretical results for the lasso. In this age of big data, the number of features measured on a person or object can be large, and might be larger than the number of observations. This book shows how the sparsity assumption allows us to tackle these problems and extract useful and reproducible patterns from big datasets. Data analysts, computer scientists, and theorists will appreciate this thorough and up-to-date treatment of sparse statistical modeling.
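The simple coordinate descent algorithm mentioned above can be sketched in a few lines: cycle over coordinates, and update each one by soft-thresholding its partial residual correlation. The Python implementation below is an illustrative sketch under the simplifying assumption of standardized predictor columns, not the authors' code.

```python
import numpy as np

def soft_threshold(z, g):
    """Soft-thresholding operator S(z, g) = sign(z) * max(|z| - g, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - g, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.

    Assumes the columns of X are standardized (mean 0, variance 1), so
    each coordinate update reduces to a single soft-threshold step.
    """
    n, p = X.shape
    b = np.zeros(p)
    r = y - X @ b                            # current residual
    for _ in range(n_iter):
        for j in range(p):
            r = r + X[:, j] * b[j]           # add back coordinate j's part
            rho = X[:, j] @ r / n            # partial residual correlation
            b[j] = soft_threshold(rho, lam)
            r = r - X[:, j] * b[j]
    return b

# Illustrative sparse recovery: only the first two coefficients are nonzero.
rng = np.random.default_rng(2)
n, p = 200, 10
X = rng.normal(size=(n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)     # standardize columns
beta = np.zeros(p)
beta[:2] = [3.0, -2.0]
y = X @ beta + rng.normal(scale=0.5, size=n)
y = y - y.mean()
bhat = lasso_cd(X, y, lam=0.1)
```

With a suitable penalty `lam`, the estimated `bhat` keeps the two true signals (slightly shrunk toward zero, the lasso's characteristic bias) and sets the noise coordinates exactly to zero.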
This user-friendly 2003 book explains the techniques and benefits of semiparametric regression in a concise and modular fashion.
This book serves well as an introduction to the more theoretical aspects of the use of spline models. It develops a theory and practice for the estimation of functions from noisy data on functionals. The simplest example is the estimation of a smooth curve, given noisy observations on a finite number of its values. Convergence properties, data-based smoothing parameter selection, confidence intervals, and numerical methods are established which are appropriate to a number of problems within this framework. Methods for including side conditions and other prior information in solving ill-posed inverse problems are provided. Data involving samples of random variables with Gaussian, Poisson, binomial, and other distributions are treated in a unified optimization context. Experimental design questions, i.e., which functionals should be observed, are studied in a general context. Extensions to distributed parameter system identification problems are made by considering implicitly defined functionals.
Despite research interest in functional data analysis in the last three decades, few books are available on the subject. Filling this gap, Analysis of Variance for Functional Data presents up-to-date hypothesis testing methods for functional data analysis. The book covers the reconstruction of functional observations, functional ANOVA, functional linear models with functional responses, ill-conditioned functional linear models, diagnostics of functional observations, heteroscedastic ANOVA for functional data, and testing equality of covariance functions. Although the methodologies presented are designed for curve data, they can be extended to surface data. Useful for statistical researchers and practitioners analyzing functional data, this self-contained book gives both a theoretical and applied treatment of functional data analysis supported by easy-to-use MATLAB® code. The author provides a number of simple methods for functional hypothesis testing. He discusses pointwise, L2-norm-based, F-type, and bootstrap tests. Assuming only basic knowledge of statistics, calculus, and matrix algebra, the book explains the key ideas at a relatively low technical level using real data examples. Each chapter also includes bibliographical notes and exercises. Real functional data sets from the text and the MATLAB code for analyzing the data examples are available for download from the author's website.
Although many books currently available describe statistical models and methods for analyzing longitudinal data, they do not highlight connections between various research threads in the statistical literature. Responding to this void, Longitudinal Data Analysis provides a clear, comprehensive, and unified overview of state-of-the-art theory and applications. It also focuses on the assorted challenges that arise in analyzing longitudinal data. After discussing historical aspects, leading researchers explore four broad themes: parametric modeling, nonparametric and semiparametric methods, joint models, and incomplete data. Each of these sections begins with an introductory chapter that provides useful background material and a broad outline to set the stage for subsequent chapters. Rather than focus on a narrowly defined topic, chapters integrate important research discussions from the statistical literature. They seamlessly blend theory with applications and include examples and case studies from various disciplines. Destined to become a landmark publication in the field, this carefully edited collection emphasizes statistical models and methods likely to endure in the future. Whether involved in the development of statistical methodology or the analysis of longitudinal data, readers will gain new perspectives on the field.
Incorporates mixed-effects modeling techniques for more powerful and efficient methods. This book presents current and effective nonparametric regression techniques for longitudinal data analysis and systematically investigates the incorporation of mixed-effects modeling techniques into various nonparametric regression models. The authors emphasize modeling ideas and inference methodologies, although some theoretical results for the justification of the proposed methods are presented. With its logical structure and organization, beginning with basic principles, the text develops the foundation needed to master advanced principles and applications. Following a brief overview, data examples from biomedical research studies are presented and point to the need for nonparametric regression analysis approaches. Next, the authors review mixed-effects models and nonparametric regression models, which are the two key building blocks of the proposed modeling techniques. The core section of the book consists of four chapters dedicated to the major nonparametric regression methods: local polynomial, regression spline, smoothing spline, and penalized spline. The next two chapters extend these modeling techniques to semiparametric and time-varying coefficient models for longitudinal data analysis. The final chapter examines discrete longitudinal data modeling and analysis. Each chapter concludes with a summary that highlights key points and also provides bibliographic notes that point to additional sources for further study. Examples of data analysis from biomedical research are used to illustrate the methodologies contained throughout the book. Technical proofs are presented in separate appendices. With its focus on solving problems, this is an excellent textbook for upper-level undergraduate and graduate courses in longitudinal data analysis.
It is also recommended as a reference for biostatisticians and other theoretical and applied research statisticians with an interest in longitudinal data analysis. Not only do readers gain an understanding of the principles of various nonparametric regression methods, but they also gain a practical understanding of how to use the methods to tackle real-world problems.
Since their introduction in 1972, generalized linear models (GLMs) have proven useful in the generalization of classical normal models. Presenting methods for fitting GLMs with random effects to data, Generalized Linear Models with Random Effects: Unified Analysis via H-likelihood explores a wide range of applications, including combining information over trials (meta-analysis), analysis of frailty models for survival data, genetic epidemiology, and analysis of spatial and temporal models with correlated errors. Written by pioneering authorities in the field, this reference provides an introduction to various theories and examines likelihood inference and GLMs. The authors show how to extend the class of GLMs while retaining as much simplicity as possible. By maximizing and deriving other quantities from h-likelihood, they also demonstrate how to use a single algorithm for all members of the class, resulting in a faster algorithm as compared to existing alternatives. Complementing theory with examples, many of which can be run by using the code supplied on the accompanying CD, this book is beneficial to statisticians and researchers involved in the above applications as well as quality-improvement experiments and missing-data analysis.
This introduction to mathematical modeling and simulation, suitable for students as well as scientists, engineers, and practitioners, requires only basic knowledge of calculus and linear algebra; all further concepts are developed in the book. Through examples discussed in detail from a wide range of fields (biology, ecology, economics, medicine, agriculture, chemistry, mechanical engineering, electrical engineering, process engineering, etc.), readers learn to engage critically with mathematical models and to formulate and implement sophisticated mathematical models themselves. The range of topics extends from statistical models to three-dimensional multiphase flow dynamics. Free open-source software is provided for all model classes discussed in the book. The basis is the operating system Gm.Linux ("Geisenheim-Linux"), developed specifically for this book, which runs without any installation effort, for example even on Windows machines. A reference card system for Gm.Linux with simple step-by-step instructions makes it possible to carry out even complex statistical computations or 3D flow simulations in a short time. All procedures described in the book refer to Gm.Linux 2.0 (and the versions of all application programs fixed therein) and can therefore be used in the long term, independently of software updates.