Mortality in renal transplant recipients given erythropoietins to increase haemoglobin concentration: cohort study
Objective: To determine the optimal range of increase in haemoglobin concentration with treatment with erythropoietins that is safe and is not associated with mortality.
Design: Retrospective cohort study. The analysis was adjusted for several covariables with Cox regression analysis with spline functions. Use of erythropoietins, haemoglobin concentration, and covariables were included in a time varying manner; variable selection was based on the purposeful selection algorithm.
Setting: Transplantation centres in Austria.
Participants: 1794 renal transplant recipients recorded in the Austrian Dialysis and Transplant Registry who received a transplant between 1 January 1992 and 31 December 2004 and survived at least three months.
Main outcome measures: Survival time and haemoglobin concentration after treatment with erythropoietins.
Results: The prevalence of use of erythropoietins has increased over the past 15 years to 25%. Unadjusted extended Kaplan-Meier analysis suggests higher mortality in patients treated with erythropoietins, in whom 10 year survival was 57% compared with 78% in those not treated with erythropoietins (P<0.001). In the treated patients there were 5.4 events/100 person years, compared with 2.6 events/100 person years in those not treated (P<0.001). After adjustment for confounding by indication, comorbidities, comedication, and laboratory readings, haemoglobin concentrations >125 g/l were associated with increased mortality in treated patients (hazard ratio 2.8 (95% confidence interval 1.0 to 7.9) for haemoglobin concentration 140 g/l v 125 g/l), but not in those not treated (0.7, 0.4 to 1.5). When haemoglobin concentrations were 147 g/l or above, patients treated with erythropoietins showed significantly higher mortality than those who were not treated (3.0, 1.0 to 9.4).
Conclusion: Increasing haemoglobin concentrations to above 125 g/l with erythropoietins in renal transplant recipients is associated with an increase in mortality. This increase was significant at concentrations above 140 g/l.
Calcineurin Inhibitor-Based Immunosuppressive Therapy, Donor Age, and Long-Term Outcome After Kidney Transplantation
Background. It is unclear whether the choice of maintenance immunosuppression modulates the negative effect of advanced donor age on outcome after renal transplantation.
Methods. All 1829 patients who received their first transplant between 1990 and 2003 at the Vienna Medical Centre and had a functioning graft after 90 days were studied. At this time point, 1587 received calcineurin inhibitors (CNI+), 242 did not (CNI-). Actual and functional graft survival was analyzed in subgroups based on donor age (<36, 36-49, 50-64, and >64 years) and immunosuppressive therapy.
Results. The median follow-up time was 7 years. In total, we observed 312 deaths and 275 graft losses. After adjusting for several variables considered as potential confounders, actual graft survival was better in CNI+ patients compared with CNI- patients only if donor age was less than 36 years (adjusted hazard ratio 0.25, 95% confidence interval 0.17-0.38) or 36 to 49 years (0.43, 95% confidence interval 0.29-0.62). Similar results were obtained for functional graft survival. Patient survival was significantly better in CNI+ subjects irrespective of donor age (0.41, 95% confidence interval 0.30-0.57).
Discussion. Use of CNI 90 days after transplantation is associated with improved patient survival even after adjustment for confounders, but its beneficial association with actual and functional graft survival is lost or at least reduced if kidneys from donors older than 50 years are used.
A weighted Cox model for modelling time-dependent exposures in the analysis of case–control studies
Many exposures investigated in epidemiological case–control studies may vary over time. The effects of these exposures are usually estimated using logistic regression, which does not directly account for changes in covariate values over time within individuals. By contrast, the Cox model with time-dependent covariates directly accounts for these changes over time. However, the over-sampling of cases in case–control studies, relative to controls, requires manipulating the risk sets in the Cox partial likelihood. A previous study showed that simple inclusion or exclusion of future cases in each risk set induces an under- or over-estimation bias in the regression parameters, respectively. We investigate the performance of a weighted Cox model that weights subjects according to age-conditional probabilities of developing the disease of interest in the source population. In a simulation study, the lifetime experience of a source population is first generated and a case–control study is then simulated within each population. Different characteristics of exposure are generated, including time-varying intensity. The results show that the estimates from the weighted Cox model are much less biased than the Cox models that simply include or exclude future cases, and are superior to logistic regression estimates in terms of bias and mean-squared error. An application to frequency-matched population-based case–control data on lung cancer illustrates similar differences in the estimated effects of different smoking variables. The investigated weighted Cox model is a potential alternative method to analyse matched or unmatched population-based case–control studies with time-dependent exposures.
An overview of the objectives of and the approaches to propensity score analyses
The assessment of treatment effects from observational studies may be biased because patients are not randomly allocated to the experimental or control group. One way to overcome this conceptual shortcoming in the design of such studies is the use of propensity scores to adjust for differences in characteristics between patients treated with experimental and control interventions. The propensity score is defined as the probability that a patient received the experimental intervention conditional on pre-treatment characteristics at baseline. Here we review how propensity scores are estimated and how they can help in adjusting the treatment effect for baseline imbalances. We further discuss how to evaluate adequate overlap of baseline characteristics between patient groups, provide guidelines for variable selection and model building in modeling the propensity score, and review different methods of propensity score adjustments. We conclude that propensity analyses may help in evaluating the comparability of patients in observational studies, and may account for more potential confounding factors than conventional covariate adjustment approaches. However, bias due to unmeasured confounding cannot be corrected for.
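As a minimal illustration of the definition above, the following Python sketch estimates the propensity score for a single categorical baseline characteristic as the observed fraction of treated patients within each category, and then derives inverse-probability weights. The toy data, the function names and the choice of inverse-probability weighting are assumptions made for illustration only; in practice the score is usually estimated with a regression model over many covariates.

```python
from collections import defaultdict


def estimate_propensity(covariate, treated):
    """Estimate P(treated = 1 | covariate level) by stratum-wise proportions."""
    n = defaultdict(int)
    n_treated = defaultdict(int)
    for x, t in zip(covariate, treated):
        n[x] += 1
        n_treated[x] += t
    return {x: n_treated[x] / n[x] for x in n}


def ipw_weights(covariate, treated, propensity):
    """Inverse-probability weights: 1/e(x) if treated, 1/(1 - e(x)) if control.

    Assumes adequate overlap, i.e. 0 < e(x) < 1 in every stratum.
    """
    weights = []
    for x, t in zip(covariate, treated):
        e = propensity[x]
        weights.append(1.0 / e if t == 1 else 1.0 / (1.0 - e))
    return weights


# Hypothetical toy data: baseline risk group and treatment indicator
# (1 = experimental intervention, 0 = control).
risk_group = ["high", "high", "high", "high", "low", "low", "low", "low"]
treatment = [1, 1, 1, 0, 1, 0, 0, 0]

ps = estimate_propensity(risk_group, treatment)
w = ipw_weights(risk_group, treatment, ps)
```

After weighting, the treated and control groups in this toy example each represent the full sample, which is one way to check that the weights balance the baseline characteristic.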
Combining difference and equivalence test results in spatial maps
Background: Regionally partitioned health indicator values are commonly presented in choropleth maps. Policymakers and health authorities use them for, among other purposes, health reporting, demand planning and quality assessment. Quite often the question arises whether the health situation in certain areas can be considered different from or equivalent to a reference value.
Results: Highlighting statistically significant areas enables the statement that these areas differ from the reference value. However, this approach does not allow conclusions about which areas are sufficiently close to the reference value, although these are crucial for health policy making as well. To overcome this weakness, a combined integration of statistical difference and equivalence tests into choropleth maps is suggested, and the approach is exemplified with health data of Austrian newborns.
Conclusions: The suggested method will improve the interpretability of choropleth maps for policymakers and health authorities.
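The combination of difference and equivalence tests can be sketched as a small classification rule per map area. The Python sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes a normally distributed area estimate with known standard error, a two-sided z-test for difference, and two one-sided tests (TOST) against a hypothetical equivalence margin. The function name, labels and numbers are invented for the example.

```python
from statistics import NormalDist


def classify_area(estimate, std_error, reference, margin, alpha=0.05):
    """Label one area by combining a difference test and an equivalence test."""
    z_two_sided = NormalDist().inv_cdf(1 - alpha / 2)
    z_one_sided = NormalDist().inv_cdf(1 - alpha)
    diff = estimate - reference

    # Difference test: the (1 - alpha) confidence interval excludes zero.
    different = abs(diff) > z_two_sided * std_error

    # Equivalence test (TOST): the (1 - 2*alpha) confidence interval lies
    # entirely inside (-margin, +margin).
    lower = diff - z_one_sided * std_error
    upper = diff + z_one_sided * std_error
    equivalent = lower > -margin and upper < margin

    if different and equivalent:
        return "different but within margin"
    if different:
        return "different"
    if equivalent:
        return "equivalent"
    return "inconclusive"


# Hypothetical areas compared with a reference value of 10 and a margin of 1.
labels = [
    classify_area(13.0, 0.5, reference=10.0, margin=1.0),
    classify_area(10.1, 0.2, reference=10.0, margin=1.0),
    classify_area(10.5, 1.0, reference=10.0, margin=1.0),
    classify_area(10.5, 0.1, reference=10.0, margin=1.0),
]
```

Each label could then be mapped to its own colour or hatching in a choropleth map, so that areas that are neither demonstrably different nor demonstrably equivalent remain visibly distinct.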
Often, interesting candidate tumor markers are not only genes that show homogeneously higher expression (HHE)
in tumor samples compared to control samples, but also genes with only predominantly higher expression (PHE),
i.e. genes which exhibit higher expression in at least 80% of tumor samples. Standard parametric test statistics
used in the analysis of microarray experiments may fail with PHE as a consequence of the mixture of distributions
present in the tumor group. As an alternative we consider trimmed t-statistics which compare group means after
removing outliers in each group. The trimming proportion can be chosen adaptively, either based on a boxplot
outlier detection rule or by optimization over a series of tests with varying trimming proportions. The trimmed
t-statistics can be plugged into the ‘significance analysis of microarrays’ (SAM) procedure, yielding the modified
boxplot rule test (modBox) and the modified optimization test (modOpt), respectively. By means of simulation
of microarray experiments, we show that modOpt is superior to contenders in detecting PHE, while there is
only little loss in efficiency under HHE compared to SAM. Analysis of a real microarray experiment revealed
that, out of nearly 29,000 genes, about 417 genes exhibiting PHE are detected by modOpt but missed by SAM.
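To make the idea of a trimmed t-statistic concrete, here is a minimal Python sketch. It assumes a fixed, symmetric trimming proportion and a Welch-type statistic computed on the trimmed samples; the adaptive choice of the trimming proportion (boxplot rule or optimization) and the embedding into SAM described above are not reproduced. All names and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def trim(values, proportion):
    """Drop the given proportion of observations from each tail."""
    k = int(len(values) * proportion)
    ordered = sorted(values)
    return ordered[k:len(ordered) - k]


def trimmed_t(group_a, group_b, proportion=0.1):
    """Welch-type t-statistic computed after trimming both groups."""
    a = trim(group_a, proportion)
    b = trim(group_b, proportion)
    std_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / std_error


# Hypothetical expression values: 8 of 10 tumor samples show higher
# expression (a PHE pattern), 2 resemble the control samples.
tumor = [5.1, 5.3, 4.9, 5.2, 5.0, 5.4, 5.1, 5.2, 1.0, 1.2]
control = [1.0, 1.1, 0.9, 1.2, 1.0, 1.1, 0.8, 1.0, 1.2, 0.9]

t_plain = trimmed_t(tumor, control, proportion=0.0)
t_trimmed = trimmed_t(tumor, control, proportion=0.2)
```

In this toy example the two low tumor values inflate the variance of the untrimmed tumor group, so the trimmed statistic is considerably larger than the untrimmed one.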
Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets
Conditional logistic regression is used for the analysis of binary outcomes when subjects are stratified
into several subsets, e.g., matched pairs or blocks. Log odds ratio estimates are usually found by
maximizing the conditional likelihood. This approach eliminates all strata-specific parameters by
conditioning on the number of events within each stratum. However, in the analyses of both an
animal experiment and a lung cancer case-control study conditional maximum likelihood (CML)
resulted in infinite odds ratio estimates and monotone likelihood. Estimation can be improved by
using Cytel Inc.'s well-known LogXact software, which provides a median unbiased estimate and
exact or mid-p confidence intervals. Here we suggest and outline point and interval estimation based
on maximization of a penalized conditional likelihood in the spirit of Firth's (Biometrika 1993; 80:27–38)
bias correction method (CFL). We present comparative analyses of both studies, demonstrating
some advantages of CFL over competitors. We report on a small-sample simulation study where CFL
log odds ratio estimates were almost unbiased, while LogXact estimates showed some bias and CML
estimates exhibited serious bias. Confidence intervals and tests based on the penalized conditional
likelihood had close-to-nominal coverage rates and yielded highest power among all methods compared,
respectively. Therefore, we propose CFL as an attractive solution to the stratified analysis of binary
data, irrespective of the occurrence of monotone likelihood. A SAS program implementing CFL is
available at: http://www.muw.ac.at/msi/biometrie/programs
Neuropathological biomarker candidates in brain tumors: key issues for translational efficiency.
Brain tumors comprise a large spectrum of rare malignancies in children and adults that are often associated with severe
neurological symptoms and fatal outcome. Neuropathological tumor typing provides both prognostic and predictive tissue information
which is the basis for optimal postoperative patient management and therapy. Molecular biomarkers may extend and refine
prognostic and predictive information in a brain tumor case, providing more individualized and optimized treatment options. In the
recent past a few neuropathological brain tumor biomarkers have translated smoothly into clinical use whereas many candidates
show protracted translation. We investigated the causes of protracted translation of candidate brain tumor biomarkers. Considering the
research environment from personal, social and systemic perspectives we identified eight determinants of translational success: methodology,
funding, statistics, organization, phases of research, cooperation, self-reflection, and scientific progeny. Smoothly translating biomarkers
are associated with low degrees of translational complexity whereas biomarkers with protracted translation are associated with
high degrees. Key issues for translational efficiency of neuropathological brain tumor biomarker research seem to be related to (i)
the strict orientation to the mission of medical research, that is, the improvement of medical practice as the primordial purpose of research, (ii)
definition of research priorities according to clinical needs, and (iii) absorption of translational complexities by means of operatively
beneficial standards. To this end, concrete actions should comprise adequate scientific education of young investigators, and
shaping of integrative diagnostics and therapy research both on the local level and the level of influential international brain tumor
research platforms.
Gene selection in microarray survival studies under possibly non-proportional hazards
Motivation: Univariate Cox regression (COX) is often used to select genes possibly linked to survival.
With non-proportional hazards (NPH), COX could lead to under- or overestimation of effects. The effect size
measure c = P(T1 < T0), i.e. the probability that a person randomly chosen from group G1 dies earlier than
a person from G0, is independent of the proportional hazards (PH) assumption. Here we consider its generalization
to continuous data c' and investigate the suitability of c' for gene selection.
Results: Under PH, c' is most efficiently estimated by COX. Under NPH, c' can be obtained by weighted Cox regression
(WHE) or a novel method, concordance regression (CON). The least biased and most stable estimates were obtained by
CON. We propose to use c' as summary measure of effect size to rank genes irrespective of different types of NPH and
censoring patterns.
Availability: WHE and CON are available as R packages.
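For intuition about the effect size measure c = P(T1 < T0), the following Python sketch computes its empirical counterpart for two groups of fully observed survival times by comparing all pairs. This is an illustration only: it ignores censoring, counts ties as one half (an assumption), and is not the weighted Cox or concordance regression estimator described above. The function name and data are hypothetical.

```python
def concordance_probability(times_g1, times_g0):
    """Empirical P(T1 < T0) over all pairs; ties contribute one half.

    Assumes all survival times are observed (no censoring).
    """
    pairs = 0
    score = 0.0
    for t1 in times_g1:
        for t0 in times_g0:
            pairs += 1
            if t1 < t0:
                score += 1.0
            elif t1 == t0:
                score += 0.5
    return score / pairs


# Hypothetical survival times (e.g. in months) for groups G1 and G0.
c_hat = concordance_probability([2, 4, 6], [3, 5, 7])
c_null = concordance_probability([1, 2], [1, 2])
```

A value of 0.5 indicates no difference between the groups, while values above 0.5 indicate that members of G1 tend to die earlier than members of G0.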
The estimation of average hazard ratios by weighted Cox regression
Often the effect of at least one of the prognostic factors in a Cox regression model changes over
time, which violates the proportional hazards assumption of this model. As a consequence, the average
hazard ratio for such a prognostic factor is under- or overestimated. While there are several methods to
appropriately cope with non-proportional hazards, in particular by including parameters for time-dependent
effects, weighted estimation in Cox regression is a parsimonious alternative without additional parameters.
The methodology, which extends the weighted k-sample logrank tests of the Tarone-Ware scheme to models with
multiple, binary and continuous covariates, was introduced in the 1990s and is
further developed and re-evaluated in this contribution. The notion of an average hazard ratio is defined and
its connection to the effect size measure is emphasized. The suggested approach accomplishes estimation of
intuitively interpretable average hazard ratios and provides tools for inference. A Monte Carlo study confirms
satisfactory performance. Advantages of the approach are exemplified by comparing standard and weighted analyses
of an international lung cancer study. SAS and R programs facilitate application.
Avoiding infinite estimates of time-dependent effects in small-sample survival studies
We address the phenomenon of monotone likelihood in Cox regression with time-dependent effects.
Monotone likelihood occurs in the fitting process of a Cox model if at least one parameter estimate
diverges to ± infinity. We show that the probability of monotone likelihood is increased by the inclusion
of time-dependent effects, particularly in small samples with several unbalanced and highly predictive
covariates, and with a high percentage of censoring. Firth's bias reduction procedure was shown to
provide an ideal solution to monotone likelihood. Here we extend his idea to Cox regression with
time-dependent effects. By penalized maximum likelihood estimation, finite hazard ratio estimates
of constant and time-dependent effects can be obtained. Penalized likelihood ratio tests and profile
penalized likelihood confidence intervals are proposed as tools for inference. A Monte Carlo study of
Cox regression with time-dependent effects confirms advantages of Firth-corrected over standard Cox
analysis in terms of average bias and median absolute deviation. We also compare the Firth-corrected
and standard Cox approaches by means of analyses of two studies with time-dependent effects. A
SAS macro and an R package for Firth-corrected Cox regression with time-varying covariates and
time-dependent effects are available at: http://www.muw.ac.at/msi/biometrie/programs
Proposals for Sample Size Calculation Programs
Objectives: Numerous sample size calculation programs
are available nowadays. They include both commercial
products as well as public domain and open
source applications. We propose modifications for these
programs in order to even better support statistical consultation
during the planning stage of a two-armed
clinical trial.
Methods: Directional two-sided tests are commonly
used for two-armed clinical trials. This may lead to a
non-negligible Type III error risk in a severely underpowered
study. In the case of a reasonably sized study
the question of the so-called auxiliary alternative may
arise.
Results: We propose that sample size calculation programs
should be able to compute i) Type III errors and
the so-called q-values, ii) minimum sample sizes
required to keep the q-values below pre-specified
levels, and iii) detectable effect sizes of the so-called
auxiliary alternatives.
Conclusions: Proposals i and ii are intended to help
prevent irresponsibly underpowered clinical trials,
whereas the proposal iii is meant as additional assistance
for the planning of reasonably sized clinical trials.
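The Type III error risk in an underpowered trial can be illustrated numerically. The Python sketch below assumes a two-arm comparison of means with a known common standard deviation, equal group sizes and a two-sided z-test, and it takes "Type III error" to mean rejecting the null hypothesis in the direction opposite to the true effect. These are assumptions for illustration; the q-values and auxiliary alternatives mentioned above are not computed because the abstract does not define them. Function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rejection_probabilities(delta, sigma, n_per_arm, alpha=0.05):
    """Return (P(reject in the correct direction), P(reject in the wrong direction)).

    delta: true mean difference (assumed non-negative), sigma: common SD.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the z-statistic under the true effect.
    shift = delta / (sigma * sqrt(2.0 / n_per_arm))
    correct = 1 - std_normal.cdf(z_crit - shift)
    wrong = std_normal.cdf(-z_crit - shift)
    return correct, wrong


# Hypothetical scenarios: a small true effect with a tiny and a large trial.
small_correct, small_wrong = rejection_probabilities(0.1, 1.0, n_per_arm=10)
large_correct, large_wrong = rejection_probabilities(0.1, 1.0, n_per_arm=2000)
```

In the small trial the wrong-direction rejection probability is of the same order of magnitude as the correct-direction one, whereas in the large trial it becomes negligible.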
Toxicokinetic modeling for environmental health problems
Toxicokinetic (TK) and physiologically based toxicokinetic (PB-TK) models enable tissue dosimetry and the
definition of the target organ dose after exposure to exogenous toxic compounds. This has qualified PB-TK
models for extrapolation from the experimental animal to the human, from high to low doses, between routes of
exposure, between patterns of exposure, and also between robust and susceptible sub-populations. We show how
PB-TK models are constructed, what assumptions are needed, which mathematical methods are used for model
building and by which statistical methods one may assess both model fit and uncertainty of the modeling itself.
We will address in particular a new generation of PB-TK models which include age-dependent model parameters
to model life-long human exposure with an example from the exposure to dioxins. Finally, we define the role of
PB-TK models for the identification of human health effects after exposure to chemicals in risk assessment.
On the translation of uncertainty from toxicokinetic to toxicodynamic models - The TCDD example
When estimating human health risks from exposure to TCDD using toxicokinetic and toxicodynamic models, it is important to
understand how model choice and assumptions necessary for modeling add to the uncertainty of risk estimates. Several toxicokinetic
models have been proposed for the risk assessment of dioxins; in particular, the elimination kinetics in humans has been a matter of constant
debate. For a long time, simple linear elimination kinetics has been the common choice. Thus, it was used for the statistical analysis of
the largest occupationally exposed cohort, the German Boehringer cohort.
We challenge this assumption by considering, amongst others, a nonlinear modified Michaelis–Menten-type elimination kinetics, the
so-called Carrier kinetics. Using the area under the lipid TCDD concentration–time curve as dose metric, we model the time to
cancer-related death using the Cox proportional hazards model as toxicodynamic model. This risk assessment set-up was simulated in order to
quantify uncertainty of both the dose (TCDD body burden) and the risk estimates, depending on the use of the kinetic model, variations
of carcinogenic effect of TCDD and variations of latency period (lag time).
If past exposure is estimated assuming linear elimination kinetics although Carrier kinetics actually holds, then truly high exposures
will be underestimated in the statistical analysis and truly low exposures will be overestimated. This bias will carry over
to the estimated individual concentration–time curves and the TCDD dose metric values derived from them. Using biased dose values
when estimating a dose–response relationship will finally lead to biased risk estimates. The extent of bias and the decrease of precision are
quantified in selected scenarios through this simulation approach. Our findings are in concordance with recent results in the field of
dioxin risk assessment. They also reinforce the general demand for the scheduled uncertainty assessments in risk analyses.
Technical uncertainty in the back-calculation of occupational exposure to dioxins
Members of a cohort of workers in the chemical industry (the so-called Boehringer cohort) exposed to
2,3,7,8-tetrachlorodibenzo-para-dioxin (TCDD) from 1950 to 1984 were subjected in the years 1985–1986 and
1992–1994 to an extensive biomonitoring programme on the TCDD levels of the individual workers. For
establishing a dose–response relationship between TCDD-exposure and potentially carcinogenic response,
the individual TCDD concentration–time courses had to be back-calculated over a period of up to more
than four decades. Two back-calculations were attempted for this sophisticated modelling and estimation
task, both based on the same toxicokinetic model but yielding different results.
We demonstrate here by means of a computer simulation study that these differences could be plausibly
explained by the so-called technical uncertainty caused by the employment of different statistical
estimation techniques. We show that the estimation techniques perform particularly differently in the
presence of workplace misclassification and TCDD measurement error, two complications of exposure
assessment that very probably affect that cohort’s data concurrently.
We conclude that technical uncertainty sensibly enlarges the pool of possible explanations for contradictory
empirical results of complex modelling and estimation approaches and should be considered as
an obligatory uncertainty analysis step after the primary risk analysis evaluation in epidemiological and
environmental studies.
Randomized clinical trials (RCTs) are best suited to investigating the efficacy of interventions.
Their results are regarded as the highest level of evidence. Publications of RCTs have already passed peer review
successfully; nevertheless, it cannot be ruled out that important undetected flaws remain in the study.
It is still up to readers to judge the quality of the publication and to ask whether the published results are applicable to
their patients. The most important points of such a critical appraisal are reviewed and discussed with a
focus on breast cancer trials.
Basic principles in the planning of clinical trials in surgical oncology
Background: ICH (International Conference on Harmonisation of Technical Requirements for Registration
of Pharmaceuticals for Human Use) provides guidelines on the implementation of clinical trials. All study
participants are obliged to follow these guidelines in line with “Good Clinical Practice”.
Methods: The main features of a clinical study include the following items: Background and general aims,
specific objectives, patient selection criteria, treatment schedules, methods of patient evaluation, trial design,
registration and randomization of patients, patient consent, required size of study, monitoring of trial progress,
forms and data handling, protocol deviations, plans for statistical analysis and administrative responsibilities.
Results: All items mentioned above should already be discussed in the planning stage of a clinical trial and
addressed in the study protocol. The study protocol provides a guideline for any person involved in the trial.
Conclusions: For the success of a clinical trial, it is especially important to have a clear and exact definition
of the study hypotheses and to choose primary and secondary endpoints very carefully.
Gene expression profiling: Does it add predictive accuracy to clinical characteristics in cancer prognosis?
It is widely accepted that gene expression classifiers need to be externally validated by showing that they predict the
outcome well enough on other patients than those from whose data the classifier was derived. Unfortunately, the gain
in predictive accuracy by the classifier as compared to established clinical prognostic factors often is not quantified.
Our objective is to illustrate application of appropriate statistical measures for this purpose. In order to compare the
predictive accuracies of a model based on the clinical factors only and of a model based on the clinical factors plus
the gene classifier, we compute the decrease in predictive inaccuracy and the proportion of explained variation.
These measures have been obtained for three studies of published gene classifiers: for survival of lymphoma patients,
for survival of breast cancer patients, and for the diagnosis of lymph node metastases in head and neck cancer.
For the three studies our results indicate varying and possibly small added explained variation and predictive accuracy
due to gene classifiers. Therefore, the gain of future gene classifiers should routinely be demonstrated by appropriate
statistical measures, such as the ones we recommend.
Parsimonious analysis of time-dependent effects in the Cox model
Cox's proportional hazards model can be extended to accommodate time-dependent effects
A simulation study comparing properties of heterogeneity measures in meta-analyses
The assessment of heterogeneity or between-study variance is an important issue in meta-analysis.
It determines the statistical methods to be used and the interpretation of the results. Tests of
heterogeneity may be misleading either due to low power for sparse data or to the detection of
irrelevant amounts of heterogeneity when many studies are involved. In the former case, notable
heterogeneity may remain unconsidered and an unsuitable model may be chosen, and the latter case
may lead to unnecessarily complex analysis strategies. Measures of heterogeneity are better suited
to determine appropriate analysis strategies.
We review two measures with different scaling and compare them with the heterogeneity test.
Estimates of the within-study variance are discussed and a new total information measure is
introduced. Various properties of the quantities in question are assessed by a simulation study.
Heterogeneity test and measures are not directly related to the amount of between-study variance
but to the relative increase of variance due to heterogeneity. It is more favorable to base the
within-study variance estimate on the squared weights of individual studies than on the sum of
weights. A heterogeneity measure scaled to a fixed interval needs reference values for proper
interpretation. A measure defined by the relation of between- to within-study variance has a more
natural interpretation but no upper limit. Both measures are quantifications of the impact of
heterogeneity on the meta-analysis result as both depend on the variance of the individual study
effects and thus on the number of patients in the studies.
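The following Python sketch shows how a heterogeneity test statistic and two differently scaled measures can be computed from study effects and their within-study variances. The abstract does not name the measures, so the choices here are assumptions: Cochran's Q as the test statistic, I² = (Q - df)/Q as a measure scaled to the interval [0, 1), and (Q - df)/df as an unbounded ratio-type measure. Both measures are truncated at zero. The function name and data are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return (Q, I2, ratio) for inverse-variance weighted study effects."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    excess = max(0.0, q - df)
    i2 = excess / q if q > 0 else 0.0   # bounded above by 1
    ratio = excess / df                 # no upper limit
    return q, i2, ratio


# Hypothetical meta-analysis of three studies with equal within-study variance.
q, i2, ratio = heterogeneity([0.0, 1.0, 2.0], [0.25, 0.25, 0.25])
```

Because the weights are inverse within-study variances, shrinking those variances (larger studies) increases Q and both measures even when the spread of the study effects stays the same, which matches the dependence on study size noted above.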
A comparative investigation of methods for logistic regression with separated or nearly separated data
In logistic regression analysis of small or sparse data sets, results obtained by classical maximum
On the role of ex post uncertainty assessment for risk management
A risk management decision whether a chemical compound present
Careful Use of Pseudo R-Squared Measures in Epidemiological Studies
Many epidemiological research problems deal with large numbers
of exposed subjects of whom only a small number actually suffer the
adverse event of interest. Such rare events data can be analysed by
employing an approximate Poisson model. The objective of this study
is to challenge the interpretability of the corresponding Poisson pseudo
R-squared measure. It will lack sensible interpretation whenever the
approximate Poisson outcome is generated by counting the number
of events within covariate patterns formed by cross-tabulating categorical
covariates. The failure is caused by the immanent arbitrariness in the
definition of the covariate patterns, that is, independent Bernoulli events,
B(1,p), are arbitrarily combined into binomially distributed ones, B(n,p),
which are then approximated by the Poisson model.
Pseudo R-squared measures for Poisson regression models with over- or underdispersion
The Poisson regression model is frequently used to analyze count data.
Pseudo R-squared measures for Poisson regression models have recently
been proposed and bias adjustments recommended in the presence of small
samples and/or a large number of covariates. In practice, however, data are
often over- or sometimes even underdispersed as compared to the standard
Poisson model. The definition of Poisson R-squared measures can be applied
in these situations as well, albeit with bias adjustments accordingly adapted.
These adjustments are motivated by arguments of quasi-likelihood theory.
Properties of unadjusted and adjusted R-squared measures are studied by
simulation under standard Poisson, over- and underdispersed Poisson regression
models, and their use is exemplified and discussed with popcorn data.
Exact logrank tests for unequal follow-up
The asymptotic log rank and generalized Wilcoxon tests are the
standard procedures for comparing samples of possibly censored
survival times. For comparison of samples of very different sizes,
an exact test is available that is based on a complete permutation
of log rank or Wilcoxon scores. While the asymptotic tests do not
keep their nominal sizes if sample sizes differ substantially, the
exact complete permutation test requires equal follow-up of the
samples. Therefore we have developed and present two new exact
tests also suitable for unequal follow-up. The first of these is
an exact analogue of the asymptotic log rank test and conditions
on observed risk sets, whereas the second approach permutes
survival times while conditioning on the realized follow-up in
each group. In an empirical study we compare the new procedures
with the asymptotic log rank test, the exact complete permutation
test, and an earlier proposed approach which equalizes the
follow-up distributions by artificial censoring. Results confirm
highly satisfactory performance of the exact procedure
conditioning on realized follow-up, particularly in case of
unequal follow-up. The advantage of this test over other options
of analysis is finally exemplified in the analysis of a breast
cancer study.
Comparing Cox and Parametric Models in Clinical Studies
Parametric models are only occasionally used in the analysis of clinical studies
of survival although they may offer advantages over Cox's model. In this paper we
report our experience in fitting parametric models to data sets from different
clinical trials mainly performed at the Vienna University Medical School.
We emphasise the role of residuals for discriminating among candidate models and
judging their goodness of fit. The effect of misspecification of the baseline
distribution on parameter estimates and testing has been explored. The results from
parametric analyses have always been contrasted with those from Cox's model.
A Measure of Dependence for the Stratified Cox Proportional Hazards Regression Model
Kent and O'Quigley (1988) apply the concept of information gain to measure both
global and partial dependence between explanatory variables and a censored
response within the framework of the proportional hazards regression model of
Cox (1972). The definition of this measure is extended to cover also the
stratified Cox model.
Adjusted R² Measures for the Inverse Gaussian Regression Model
The R² measure is a commonly used tool for assessing the predictive ability of a
linear regression model. It quantifies the amount of variation in the outcome
variable which is explained by the covariates. Various attempts have been made
to carry the R² definition over to other types of regression models as well.
Here, two different R² measure definitions for the Inverse Gaussian regression
model will be studied. They are motivated by deviance and sums-of-squares
residuals. Depending on sample size and number of covariates fitted, these R²
measures may show substantially inflated values, and a proper bias adjustment is
necessary. Several possible adjusted R² measure definitions for the Inverse
Gaussian regression model will be compared in a simulation study. The use of
adjusted R² measures is recommended in general.
Calculating Adjusted R² Measures for Poisson Regression Models
In regression models not only the parameter estimates and significances of
explanatory variables are of interest, but also the degree to which variation in
the dependent variable can be explained by covariates. In recent publications an
R² measure based on deviance was recommended for Poisson regression models, one
of the most frequently used modelling tools in epidemiological studies. However,
when sample size is small relative to the number of covariates in the model,
simple R² measures may be seriously inflated and may need to be adjusted
according to the number of covariates in the model.
We present a SAS macro which calculates adjustments for the R² measures in
Poisson regression models based on log-likelihood and on sums of squares. The
proposed measures are applied to real data sets and their performance is discussed.
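To make the inflation problem concrete, here is a minimal Python sketch (not the SAS macro described above) that fits a log-linear Poisson model by iteratively reweighted least squares and returns the deviance-based R² together with one adjusted version. The adjustment used here is the degrees-of-freedom analogue of the adjusted R² of the linear model; it is an assumption chosen for illustration and not necessarily one of the adjustments computed by the macro. All function names are hypothetical.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50):
    """Fitted means of a log-linear Poisson model, estimated by IRLS."""
    X = np.column_stack([np.ones(len(y)), X])      # add intercept column
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())                     # start at the null model
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu                    # working response
        WX = X * mu[:, None]                       # weights equal mu for the log link
        beta = np.linalg.solve(X.T @ WX, WX.T @ z)
    return np.exp(X @ beta)

def poisson_deviance(y, mu):
    """Poisson deviance; the y*log(y/mu) term is taken as 0 when y == 0."""
    safe_y = np.where(y > 0, y, 1.0)
    return 2.0 * np.sum(y * np.log(safe_y / mu) - (y - mu))

def deviance_r2(X, y):
    """Unadjusted and df-adjusted deviance R² (the adjustment is an assumption)."""
    n, k = X.shape
    d_model = poisson_deviance(y, fit_poisson(X, y))
    d_null = poisson_deviance(y, np.full(n, y.mean()))
    r2 = 1.0 - d_model / d_null
    r2_adj = 1.0 - (n - 1) / (n - k - 1) * d_model / d_null
    return r2, r2_adj
```

With few observations and several covariates that carry no information, the unadjusted value is still positive, while the adjusted value is pulled towards (or below) zero.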
Measures of Explained Variation in Gamma Regression Models
The common R² measure provides a useful means to quantify the degree to which
variation in the dependent variable can be explained by the covariates in a
linear regression model. Recently, there have been various attempts to apply the
definition of the R² measure to generalized linear models. This paper studies
two different R² measure definitions for the gamma regression model. These
measures are related to deviance and sum-of-squares residuals. Depending on the
sample size and the number of covariates fitted, so-called unadjusted R²
measures may be substantially inflated, and the use of adjusted R² measures is
then preferred. We study several known adjustments previously proposed for R²
measures in regression models and illustrate the effect on the two unadjusted R²
measures for the gamma regression model. Comparing the resulting measures with
underlying population values, we find the best adjustment via simulation.
Predictive accuracy and explained variation
Measures of the predictive accuracy of regression models quantify the extent to
which covariates determine an individual outcome. Explained variation measures
the relative gains in predictive accuracy when prediction based on covariates
replaces unconditional prediction. A unified concept of predictive accuracy and
explained variation based on the absolute prediction error is presented for
models with continuous, binary, polytomous and survival outcomes. The measures
are given both in a model-based formulation and in a formulation directly
contrasting observed and expected outcomes. Various aspects of application are
demonstrated by examples from three forms of regression models. It is emphasized
that the likely degree of absolute or relative predictive accuracy often is low
even if there are highly significant and relatively strong covariates.
Comparing the importance of prognostic factors in Cox and logistic regression using SAS
Two SAS macro programs are presented that evaluate the relative importance of
prognostic factors in the proportional hazards regression model and in the
logistic regression model. The importance of a prognostic factor is quantified
by the proportion of variation in the outcome attributable to this factor. For
proportional hazards regression, the program %RELIMPCR uses the recently
proposed measure V to calculate the proportion of explained variation (PEV). For
the logistic model, the R² measure based on squared raw residuals is used by the
program %RELIMPLR. Both programs are able to compute marginal and partial PEV,
to compare PEVs of factors, of groups of factors, and even to compare PEVs of
different models. The programs use a bootstrap resampling scheme to test
differences of the PEVs of different factors. Confidence limits for P-values are
provided. The programs further allow the computation of PEV to be based on
models with shrunken or bias-corrected parameter estimates. The SAS macros are
freely available at the WWW site www.meduniwien.ac.at/msi/biometrie/relimp.
Fixing the nonconvergence bug in logistic regression with SPLUS and SAS
When analyzing clinical data with binary outcomes, the parameter estimates and
consequently the odds ratio estimates of a logistic model sometimes do not
converge to finite values. This phenomenon is due to special conditions in a
data set and is known as separation. Statistical software packages for logistic
regression using the maximum likelihood method cannot appropriately deal with
this problem. A new procedure to solve the problem has been proposed by Heinze
and Schemper (2001). It has been shown that, unlike the standard maximum
likelihood method, this method always leads to finite parameter estimates. We
developed a SAS macro and an SPLUS library to make this method available from
within one of these widely used statistical software packages. Our programs are
also capable of performing interval estimation based on profile penalized log
likelihood (PPL) and of plotting the PPL function as was suggested by Heinze and
Schemper (2001).
Predictive Accuracy and Explained Variation in Cox Regression
We suggest a new measure of the proportion of the variation of possibly censored
survival times explained by a given proportional hazards model. The proposed
measure, termed V, shares several favourable properties (cf. Henderson, 1995,
Statistics in Medicine 14, 161-184) with an earlier V1 (Schemper, 1990,
Biometrika 77, 216-218) but also improves the handling of censoring. The
statistic contrasts distance measures between individual 1/0 survival processes
and fitted survival curves with and without covariate information. These
distance measures, Dx and D respectively, are themselves informative as
summaries of absolute rather than relative predictive accuracy. We recommend
graphical comparisons of survival curves for prognostic index groups to improve
the understanding of obtained values for V, Dx and D. Their use and
interpretation are exemplified for a Yorkshire lung cancer study on survival.
From this and an overview for several well known clinical data sets, we show
that the likely amount of relative or absolute predictive accuracy is often low
even if there are highly significant and relatively strong prognostic factors.
Adjustments for R²-Measures for Poisson regression models
In regression models not only the parameter estimates and significances of
explanatory variables are of interest, but also the degree to which variation in
the dependent variable can be explained by covariates. In recent publications an
R²-measure based on deviance was recommended for Poisson regression models, one
of the most frequently used modelling tools in epidemiological studies. However,
when sample size is small relative to the number of covariates in the model,
simple R²-measures may be seriously inflated and may need to be adjusted
according to the number of covariates in the model. Two new adjustments for the
R²-measure in Poisson regression models based on deviance residuals are
presented and compared by simulation with population values. The proposed
measures are also applied to real data sets.
Using SAS to calculate the Kent and O'Quigley measure of dependence for Cox proportional hazards regression model
Kent and O'Quigley (1988) apply the concept of information gain to define a
measure of dependence (R-squared measure) between explanatory variables and a
censored response variable within the framework of the Cox model. Two SAS macros
to calculate this measure are presented. The first one is based on a
Newton-Raphson search and makes use of the SAS IML procedure. The second one is
a simple grid search using SAS DATA steps and Base-SAS procedures.
Computing measures of explained variation for logistic regression models
The proportion of explained variation in logistic regression can be suitably
expressed by the multiple R² originally developed for the general linear model
(cf. Mittlböck and Schemper (1996)). In this paper we present a detailed
investigation of this measure in small samples and/or with many covariates and
propose either of two adjustments, one being a direct analogue of R²adj of the
general linear model, and the other being based on shrinkage. Furthermore we
explore the use of bootstrap confidence intervals and give a table of expected
variability of estimates of explained variation for samples of varying sizes. We
recommend quantifying gains of predictive precision due to prognostic factors by
both relative and absolute measures. For binary outcomes the components of the
relative measure, R², are suitable absolute measures of predictive precision.
They are interpretable as average absolute residuals conditional on using
prognostic factors and without such information. Application of the presented
measures is motivated by the statistical analysis of a study of physical
characteristics of urine possibly related to the presence of calcium oxalate
crystals.
A new approach to estimate correlation coefficients in the presence of censoring and proportional hazards.
Estimation of correlations when one of the variables is censored has received
only little attention in the past. This presentation reviews the deficiencies of
existing approaches and presents an algorithm for the reconstruction of
Spearman, Kendall and other nonparametric correlation coefficients from censored
samples. The algorithm uses Rubin's technique of multiple imputation and assumes
proportionality of hazards, as do the analyses by Cox's model which it is
intended to supplement. The unbiasedness of the estimation procedure under
proportional hazards is demonstrated even for underlying correlations of 0.9 and
90% censoring. Two important applications of the presented procedure are
exemplified: the estimation of the correlation of survival time with a
prognostic factor and the estimation of the variation of survival explained by
prognostic factors within Cox's model.
Explained variation in survival analysis.
Several measures of explained variation have been suggested for the Cox
proportional hazards regression model. We have categorized these measures into
three classes which correspond to three different definitions of multiple R² of
the general linear model. In an empirical study we compared the performance of
these measures and classified them by their adherence to a set of criteria which
we think should be met by a measure of explained variation for survival data. We
suggest that currently there is no uniformly superior measure, particularly as
the concepts of either uncensored or censored populations may lead to different
choices. For uncensored populations, a measure by Kent and O'Quigley and the
squared rank correlation between survival time and the predictor from a Cox
regression model appear recommendable choices. For the latter, censored survival
times are terminated using a very recent data augmentation algorithm for
multiple imputation under proportional hazards. With censored populations,
Schemper's measure, V2, could be considered. We give an introductory example,
discuss aspects of application and stress the desirability of routinely
evaluating explained variation in studies of survival.
Explained variation for logistic regression.
Different measures of the proportion of variation in a dependent variable
explained by covariates are reported by different standard programs for logistic
regression. We review twelve measures that have been suggested or might be
useful to measure explained variation in logistic regression models. The
definitions and properties of these measures are discussed and their performance
is compared in an empirical study. Two of the measures (squared Pearson
correlation between the binary outcome and the predictor, and the proportional
reduction of squared Pearson residuals by the use of covariates) give almost
identical results, agree very well with the multiple R² of the general linear
model, have an intuitively clear interpretation and perform satisfactorily in
our study. For all measures the explained variation for the given sample and
also the one expected in future samples can be obtained easily. For small
samples an adjustment analogous to R²adj in the general linear model is
suggested. We discuss some aspects of application and recommend the routine use
of a suitable measure of explained variation for logistic models.
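The two measures singled out above can be sketched in a few lines of Python once the fitted probabilities of a logistic model are available. This is an illustrative sketch, not the authors' software: the function name is hypothetical, and the sum-of-squares version below uses raw residuals y - p, which is an assumption about the exact residual definition.

```python
import numpy as np

def explained_variation(y, p):
    """Return (squared correlation, sum-of-squares R²) for outcomes y and fitted p."""
    y = np.asarray(y, dtype=float)      # binary outcomes coded 0/1
    p = np.asarray(p, dtype=float)      # fitted probabilities from the model
    r2_corr = np.corrcoef(y, p)[0, 1] ** 2
    ss_model = np.sum((y - p) ** 2)           # squared residuals using covariates
    ss_null = np.sum((y - y.mean()) ** 2)     # squared residuals without covariates
    r2_ss = 1.0 - ss_model / ss_null
    return r2_corr, r2_ss
```

For example, `explained_variation([0, 0, 1, 1], [0.2, 0.4, 0.6, 0.8])` gives approximately `(0.8, 0.6)`. These probabilities were chosen by hand rather than fitted, which is why the two values differ more than the abstract reports for fitted models.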
The relative importance of prognostic factors in studies of survival.
The relative importance of prognostic factors in regression can be measured
either by standardized regression coefficients or by percentages of explained
variation in a dependent variable. One advantage of using explained variation is
the direct comparability of qualitative prognostic factors with others, or of
groups of prognostic factors. The description of relative importance can be
accomplished within marginal or partial effects analyses. It is demonstrated
that it is possible not only to provide a descriptive ranking of prognostic
factors according to their statistically determined importance, but also to make
inferences concerning their relative importance, employing bootstrap techniques
and procedures for multiple comparisons. The methods presented, which are new in
the context of Cox regression, are exemplified by analyses of studies of lung
cancer and breast cancer.
Further results on the explained variation in proportional hazards regression.
Two competing measures of the proportion of variation of possibly censored
survival times explained by a given proportional hazards model are compared in a
Monte Carlo study. It is shown that the validity of a very quick likelihood
based measure (Magee, 1990) depends on a proportionality assumption. The
computationally more involved measure V2 of Schemper (1990) is robust in this
respect. Both measures produce very close results if the Cox model assumptions
are perfectly met.
The explained variation in proportional hazards regression.
Two new measures of the proportion of the variation of possibly censored
survival times explained by a given proportional hazards model are presented. It
is shown that the previously suggested proportion of explained log likelihood is
not a useful measure of predictive precision. The new measures are intuitively
appealing and sufficiently simple for routine application.
New Residuals for Cox Regression and Their Application to Outlier Screening
The identification of individuals who 'died far too early' or 'lived far too
long' as compared to their survival probabilities from a Cox regression can lead
to the detection of new prognostic factors. Methods to identify outliers are
generally based on residuals. For Cox regression only deviance residuals have
been considered for this purpose but we show that these residuals are not very
suitable. Instead, we develop and propose two new types of residuals: the
suggested log-odds and normal deviate residuals are simple, intuitively
appealing and their theoretical properties and empirical performance make them
very suitable for outlier identification. Finally, various practical aspects of
screening for individuals with outlying survival times are discussed by means of
a cancer study example.
The Interpretation of Clinical Trials of Immediate Versus Delayed Therapy
Prospective, randomised clinical comparisons with a control group are the ideal
way to evaluate the effectiveness of a new therapy. However, if the new therapy
is already available, then it may be unethical to refuse patients this treatment
indefinitely. In some trials, patients randomised to the control group receive
placebo until their condition deteriorates, and are then switched to the new
therapy. Patients randomised to the experimental group receive the new treatment
immediately. An analysis following the intention-to-treat principle will be a
valid comparison of the two treatment policies actually used. However, such an
analysis will underestimate the effect of immediate therapy relative to
completely untreated controls, and in particular a negative conclusion should
not be interpreted to mean that the therapy is ineffective. In this paper we
introduce a parametric approach, which models the dependence between the
survival time and the time of switching to the new treatment. This is used first
to illustrate the lack of power of the intention-to-treat analysis for
evaluating the therapy relative to a pure control. More speculatively, an
alternative method of analysis based on our model is presented. We illustrate
the issues with data from a prospective randomised study, in which one group of
HIV-positive patients received zidovudine immediately after randomisation while
control patients were switched to zidovudine only when their condition
deteriorated.
Gaining more flexibility in Cox proportional hazards regression models with cubic spline functions
The Cox proportional hazards model is the most popular model for the analysis of
survival data. The use of cubic spline functions allows investigation of
non-linear effects of continuous covariates and flexible assessment of
time-by-covariate interactions. Two main advantages are provided: no particular
functional form has to be specified and standard computer software packages like
SAS or BMDP can be used. A SAS macro which implements the method is presented.
Assessing interactions of binary time-dependent covariates with time in Cox proportional hazards regression models using cubic spline functions.
The Cox proportional hazards model is the most popular model for the analysis of
survival data. Time-dependent covariates can be included in a straightforward
manner. In most cases such covariates will be binary, indicating some form of
changing group membership, with individuals starting in group 0, and changing
into group 1 after the occurrence of a specific event. If there is evidence that
the hazard ratio between these two groups depends on the sojourn time in group
1, then the use of cubic spline functions will allow investigation of the shape
of the supposed effect and provide two main advantages: no particular functional
form has to be specified and standard computer software packages like SAS or
BMDP can be used.
A note on quantifying follow-up in studies of failure time.
No summary available.
Cox analysis of survival data with non-proportional hazard functions.
The consequences of violated assumptions for Cox's proportional hazards model
are discussed and current options to deal with non-proportionality in Cox's
model are reviewed. An additional option for analysis is suggested, which
produces weighted estimates of log hazard ratios, weighted at the time points
where failures occur. The procedure amounts to generalizations of the tests by
Breslow or Prentice for multiple covariates in the same manner that the
proportional hazards model is a generalization of the log rank test by Mantel.
Its advantages are representative estimates of average hazard ratios also for
covariates with non-proportional and, in particular, converging hazard
functions. The latter are often encountered in clinical applications. By means
of an empirical study these average hazard ratios are shown to be very close to
exact calculations of average hazard ratios as defined by Kalbfleisch and
Prentice. Two examples illustrate the advantages of the weighted estimation and
of other strategies for analysis with the Cox model in the presence of
non-proportional hazards. Furthermore, with respect to checking proportionality,
it is demonstrated how misleading the frequently used log-minus-log plots can be
and that the lesser known Arjas plots seem to perform quite well.
Generalisation of the Kaplan-Meier estimator for standardised processes.
The Kaplan-Meier estimator of the survival function is interpreted as an average
process estimator. Under this perspective a generalised Kaplan-Meier estimator
is presented for analysing continuous, multievent and even nonmonotonic
processes of the change in degree of function, rather than for the single and
100% loss of function process of standard survival analysis. Furthermore
estimates of the survival function in the presence of uncertain or multiple
causes of death can be obtained. Worked examples for the estimator and for a
corresponding bootstrap variance estimate are given.
Probability imputation revisited for prognostic factor studies.
The analysis of prognostic factor studies by Cox or logistic regression models
is often impeded by missing covariate values. In 1990 Schemper and Smith
recommended a conditional probability imputation technique (PIT) for the
analysis of treatment studies which can be easily applied using standard
software and which has been demonstrated to outperform the complete case and
omission of covariates strategies. Recent research, however, showed that PIT
cannot universally be recommended and it was concluded that model-based methods
should be preferred. We agree with these conclusions but also think that there
is enough empirical evidence to judge the performance of PIT to be satisfactory
in typical prognostic factor studies. Furthermore, comparisons of PIT with
multiple imputation in the same context did not indicate an advantage of the
latter more involved technique. By means of an analysis of a prostate cancer
data set various aspects of application of PIT are discussed, in particular that
PIT permits direct comparability of marginal and partial effects analyses. We
conclude that PIT continues to be an appropriate and attractive choice for
analysis of prognostic factor studies.
Efficient evaluation of treatment effects in the presence of missing covariate values.
In clinical trials, treatment comparisons are often performed by models that
incorporate important prognostic factors. Since these models require complete
covariate information on all patients, statisticians frequently resort to
complete case analysis or to omission of an important covariate. A probability
imputation technique (PIT) is proposed that involves substituting conditional
probabilities for missing covariate values when the covariate is qualitative.
Simulation results are presented which demonstrate that the method neither
violates the size of the treatment test nor introduces additional bias for the
estimation of the treatment effect. It allows use of standard software. A
clinical trial of breast cancer treatment, in which an important covariate was
partly missing, was analysed by Cox's model. The use of PIT resulted in smaller
observed error probability compared with case deletion, and sensitivity analysis
supported these results.
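The core substitution step can be illustrated with a short Python sketch. It assumes a binary covariate coded 0/1 with NaN for missing values, and it estimates the conditional probability simply as the observed proportion within levels of a single stratifying variable. Both the function name and this choice of conditioning are assumptions made for illustration; the paper's own estimate of the conditional probability may condition on other information.

```python
import numpy as np

def probability_impute(x, strata):
    """Replace missing values of a 0/1 covariate by the observed proportion in the same stratum."""
    x = np.asarray(x, dtype=float)        # NaN marks a missing covariate value
    strata = np.asarray(strata)
    filled = x.copy()
    missing = np.isnan(x)
    for s in np.unique(strata):
        in_stratum = strata == s
        observed = in_stratum & ~missing
        if observed.any():
            # Estimated conditional probability that the covariate equals 1.
            filled[in_stratum & missing] = x[observed].mean()
    return filled
```

The imputed column then contains values between 0 and 1 and can be passed to standard Cox or logistic regression software like any other covariate. Strata without any observed value are left as NaN in this sketch.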
SAS and SPLUS programs to perform Cox regression without convergence problems
When analyzing survival data, the parameter estimates and consequently the
relative risk estimates of a Cox model sometimes do not converge to finite
values. This phenomenon is due to special conditions in a data set and is known
as 'monotone likelihood'. Statistical software packages for Cox regression using
the maximum likelihood method cannot appropriately deal with this problem. A new
procedure to solve the problem has been proposed by Heinze and Schemper (2001).
It has been shown that, unlike the standard maximum likelihood method, this
method always leads to finite parameter estimates. We developed a SAS macro and
an SPLUS library to make this method available from within one of these widely
used statistical software packages. Our programs are also capable of performing
interval estimation based on profile penalized log likelihood (PPL) and of
plotting the PPL function as was suggested by Heinze and Schemper.
A Solution to the Problem of Monotone Likelihood in Cox Regression
The phenomenon of monotone likelihood is observed in the fitting process of a
Cox model if the likelihood converges to a finite value while at least one
parameter estimate diverges to ± infinity. Monotone likelihood primarily occurs
in small samples with substantial censoring of survival times and several highly
predictive covariates. Previous options to deal with monotone likelihood have
been unsatisfactory. The solution we suggest is an adaptation of a procedure by
Firth (1993, Biometrika 80, 27-38) originally developed to reduce the bias of
maximum likelihood estimates. This procedure produces finite parameter estimates
by means of penalized maximum likelihood estimation. Corresponding Wald-type
tests and confidence intervals are available but it is shown that penalized
likelihood ratio tests and profile penalized likelihood confidence intervals are
often preferable. An empirical study of the suggested procedures confirms
satisfactory performance of both estimation and inference. The advantage of the
procedure over previous options of analysis is finally exemplified in the
analysis of a breast cancer study.
A Note on Testing Areas Under the Curve when Using Destructive Measurement Techniques
The area under the curve of drug concentration over time is considered important
in many toxicological, pharmacological and medical investigations. If only one
measurement for each experimental unit has been recorded, the area under the
curve has to be estimated on the basis of the mean concentration values at the
measurement times. Defining an estimator by using a linear combination of these
mean values enables the straightforward estimation of the corresponding
variance. For testing the null hypothesis of equality between two areas under
the curve the use of Welch's test for nonpairwise contrasts is recommended; the
previously suggested normal approximation of the test statistic is not an
adequate choice since it exceeds the nominal alpha-level; this is illustrated by
simulation studies.
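The linear-combination estimator and its variance can be written down compactly. The Python sketch below assumes trapezoidal weights for the linear combination and a Satterthwaite-type approximation for the degrees of freedom of the Welch statistic; both are assumptions made for illustration, as are the function names. Each entry of `samples` holds the concentrations measured at one time point, each on a different experimental unit.

```python
import numpy as np

def auc_estimate(times, samples):
    """AUC as a weighted sum of time-point means, with per-time-point variance terms."""
    times = np.asarray(times, dtype=float)
    w = np.empty(len(times))
    w[0] = (times[1] - times[0]) / 2            # trapezoidal weights
    w[-1] = (times[-1] - times[-2]) / 2
    w[1:-1] = (times[2:] - times[:-2]) / 2
    means = np.array([np.mean(s) for s in samples])
    var_of_means = np.array([np.var(s, ddof=1) / len(s) for s in samples])
    terms = w ** 2 * var_of_means               # variance contribution per time point
    dof = np.array([len(s) - 1 for s in samples])
    return np.sum(w * means), terms, dof

def welch_auc_test(times, samples_a, samples_b):
    """Welch-type t statistic and approximate degrees of freedom for AUC_a - AUC_b."""
    auc_a, terms_a, dof_a = auc_estimate(times, samples_a)
    auc_b, terms_b, dof_b = auc_estimate(times, samples_b)
    terms = np.concatenate([terms_a, terms_b])
    dof = np.concatenate([dof_a, dof_b])
    variance = terms.sum()
    t = (auc_a - auc_b) / np.sqrt(variance)
    df = variance ** 2 / np.sum(terms ** 2 / dof)   # Satterthwaite approximation
    return t, df
```

The statistic would then be referred to a t distribution with `df` degrees of freedom rather than to the standard normal distribution, which is the point of the recommendation in the abstract.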
Randomization for clinical trials.
No summary available.
Stellung und Aufgaben der Biometrie in der klinischen und experimentellen Forschung [The position and tasks of biometry in clinical and experimental research].
A consideration of the function of biometry within clinical research from the
perspective of the philosophy of science is followed by a reflection on factors
that can influence the efficiency of clinical-biometrical consulting and
support. The discussion focuses primarily on different behaviour patterns of
physician and biometrician and on the resulting advantages and disadvantages for
the cooperative process of gaining knowledge. Further factors for effective
working practice are seen in the training of young biometricians and in the
organisational and material foundations of clinical biometry. Reflections on the
future of the discipline conclude the paper.
Explained Variation for Logistic Regression - Small Sample Adjustments, Confidence Intervals and Predictive Precision
The proportion of explained variation in logistic regression can be suitably
expressed by the multiple R² originally developed for the general linear model
(cf. Mittlböck and Schemper (1996)). In this paper we present a detailed
investigation of this measure in small samples and/or with many covariates and
propose either of two adjustments, one being a direct analogue of R²adj of the
general linear model, and the other being based on shrinkage. Furthermore, we
explore the use of bootstrap confidence intervals and give a table of the
expected variability of estimates of explained variation for samples of varying
sizes. We recommend quantifying gains of predictive precision due to prognostic
factors by both relative and absolute measures. For binary outcomes the
components of the relative measure, R², are suitable absolute measures of
predictive precision. They are interpretable as average absolute residuals
conditional on using prognostic factors and without such information. We
motivate application of the presented measures by the statistical analysis of a
study of physical characteristics of urine possibly related to the presence of
calcium oxalate crystals.
A Solution to the Problem of Separation in Logistic Regression
The phenomenon of separation or monotone likelihood is observed in the fitting
process of a logistic model if the likelihood converges while at least one
parameter estimate diverges to ± infinity. Separation primarily occurs in small
samples with several unbalanced and highly predictive risk factors. A procedure
by Firth originally developed to reduce the bias of maximum likelihood estimates
is shown to provide an ideal solution to separation. It produces finite
parameter estimates by means of penalized maximum likelihood estimation.
Corresponding Wald tests and confidence intervals are available but it is shown
that penalized likelihood ratio tests and profile penalized likelihood
confidence intervals are often preferable. The clear advantage of the procedure
over previous options of analysis is impressively demonstrated by the
statistical analysis of two cancer studies.
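To show how the penalization yields finite estimates under separation, here is a minimal Python sketch of Firth-type penalized logistic regression. It is not the authors' SAS macro or SPLUS library. It iterates on the modified score X'(y - π + h(1/2 - π)), where h holds the diagonal elements of the hat matrix, and the function name, the step cap of 5 and the iteration limits are assumptions made for illustration.

```python
import numpy as np

def firth_logistic(X, y, n_iter=100, tol=1e-8):
    """Penalized maximum likelihood estimates (intercept first) for a logistic model."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), X])          # add intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))
        W = pi * (1.0 - pi)
        info_inv = np.linalg.inv(X.T @ (X * W[:, None]))   # inverse Fisher information
        # Diagonal of the hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2).
        h = W * np.einsum("ij,jk,ik->i", X, info_inv, X)
        score = X.T @ (y - pi + h * (0.5 - pi))            # Firth-modified score
        step = np.clip(info_inv @ score, -5.0, 5.0)        # cap the step size
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

For a completely separated toy data set such as x = (-2.5, -1.5, -0.5, 0.5, 1.5, 2.5) with y = (0, 0, 0, 1, 1, 1), ordinary maximum likelihood has no finite slope estimate, whereas this iteration settles on a finite positive slope.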
A Cautionary Note on Segmenting a Cyclical Covariate by Minimum P-value Search
Recently, menstrual status at the time of surgery was suggested to be a
potential prognostic factor for survival in premenopausal women suffering from
breast cancer. That is, surgery should be avoided in a certain segment of the
menstrual cycle. However, apart from the fact that different authors claimed
different segments to be hazardous, the alleged influence on survival could
hardly be confirmed by subsequent studies. Statistical arguments could provide
an explanation for these contradictory findings. Splitting a cyclical covariate
into two segments is analogous to dichotomizing a continuous covariate. Given
that this splitting is based on a minimum P-value search, the actual type I
error rate will be much higher than the nominal one due to multiple testing. A
simulation study has been performed to gain insight into the problem.
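The inflation of the type I error rate can be reproduced with a small Python simulation. This is a simplified sketch and not the simulation reported in the paper: it assumes a 28-day cycle, a continuous outcome that is independent of cycle day (so the null hypothesis is true), and a normal-approximation two-sample test in place of a survival analysis. All names and settings are illustrative assumptions.

```python
import math
import numpy as np

def two_sided_p(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def min_p_over_segments(day, outcome, cycle_length=28, min_size=10):
    """Smallest p-value over all splits of the cycle into a segment and its complement."""
    best = 1.0
    for start in range(cycle_length):
        for width in range(1, cycle_length // 2 + 1):
            in_segment = ((day - start) % cycle_length) < width
            n1 = in_segment.sum()
            n0 = len(day) - n1
            if n1 < min_size or n0 < min_size:
                continue                       # skip splits with a very small group
            a, b = outcome[in_segment], outcome[~in_segment]
            se = math.sqrt(a.var(ddof=1) / n1 + b.var(ddof=1) / n0)
            best = min(best, two_sided_p((a.mean() - b.mean()) / se))
    return best

def type_one_error(n_sim=200, n=100, alpha=0.05, seed=1):
    """Proportion of null data sets in which the minimum p-value falls below alpha."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        day = rng.integers(0, 28, size=n)      # cycle day, unrelated to the outcome
        outcome = rng.normal(size=n)
        rejections += min_p_over_segments(day, outcome) < alpha
    return rejections / n_sim
```

Calling `type_one_error()` returns the share of simulated null data sets declared "significant" after the search, which can be compared directly with the nominal level `alpha`.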
A Note on R² Measures for Poisson and Logistic Regression Models when Both Models are Applicable
The aim of many epidemiological studies is the regression of a dichotomous
outcome (e.g., death or being affected by a certain disease) on prognostic
covariables. The Poisson regression model is often used as an alternative to the
logistic regression model. Modelling the number of events and individual
outcomes, respectively, both models lead to nearly the same results concerning
the parameter estimates and their significances. However, when calculating the
proportion of explained variation, quantified by an R² measure, a large
difference between both models usually occurs. We illustrate this difference by
an example and explain it with theoretical arguments. We conclude that the R²
measure of the Poisson regression quantifies the predictability of event rates,
but it is not adequate to quantify the predictability of the outcome of
individual observations.