---
output: github_document
bibliography: vignettes/detectseparation.bib
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# detectseparation <img src="man/figures/hex_detectseparation.svg" width="320" align="right">
<!-- badges: start -->
[CRAN status](https://CRAN.R-project.org/package=detectseparation)
[License: GPL v3](https://www.gnu.org/licenses/gpl-3.0.en.html)
[R-CMD-check](https://github.com/ikosmidis/detectseparation/actions/workflows/R-CMD-check.yaml)
[Codecov test coverage](https://app.codecov.io/gh/ikosmidis/detectseparation)
<!-- badges: end -->
[**detectseparation**](https://github.com/ikosmidis/detectseparation)
provides *pre-fit* and *post-fit* methods for the detection of
separation and of infinite maximum likelihood estimates in binomial
response generalized linear models.
The key methods are `detect_infinite_estimates()`, `detect_separation()` and `check_infinite_estimates()`.
## Installation
You can install the released version of detectseparation from [CRAN](https://CRAN.R-project.org) with:
``` r
install.packages("detectseparation")
```
and the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("ikosmidis/detectseparation", ref = "develop")
```
## Detecting and checking for infinite maximum likelihood estimates
@heinze+schemper:2002 used a logistic regression model to analyze data
from a study on endometrial cancer [see @agresti:2015, Section 5.7, or
`?endometrial` for more details on the data set]. Below, we refit the
model in @heinze+schemper:2002 to demonstrate the functionality that
**detectseparation** provides.
```{r example}
library("detectseparation")
data("endometrial", package = "detectseparation")
endo_glm <- glm(HG ~ NV + PI + EH, family = binomial(), data = endometrial)
theta_mle <- coef(endo_glm)
summary(endo_glm)
```
The maximum likelihood (ML) estimate of the parameter for `NV` is actually
infinite. The reported, apparently finite value is merely due to false
convergence of the iterative estimation procedure. The same is true
for the estimated standard error and, hence, the value `r round(coef(summary(endo_glm))["NV", "z value"], 3)` for the $z$-statistic cannot be trusted for inference on the size of the effect for `NV`.
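The effect of false convergence can be reproduced on a small scale with base R alone. The following is a minimal sketch on a hypothetical toy data set (`toy` is an assumption made for illustration and is not part of **detectseparation**) in which `y` is always 1 whenever `x` is 1, so the ML estimate of the coefficient of `x` is infinite. Tightening the convergence tolerance lets the iterations run longer, and the reported estimate for `x` simply moves further out.

``` r
# Hypothetical toy data: every observation with x = 1 has y = 1,
# so the ML estimate of the coefficient of x is +Inf
toy <- data.frame(y = c(0, 1, 0, 1, 1, 1, 1, 1),
                  x = c(0, 0, 0, 0, 1, 1, 1, 1))

# Default convergence settings of glm.control()
fit_default <- suppressWarnings(
  glm(y ~ x, family = binomial(), data = toy)
)

# Stricter tolerance and more iterations allowed
fit_strict <- suppressWarnings(
  glm(y ~ x, family = binomial(), data = toy,
      control = glm.control(epsilon = 1e-14, maxit = 100))
)

# The reported "estimate" for x depends on when the iterations stop
c(default = coef(fit_default)[["x"]],
  strict = coef(fit_strict)[["x"]])
```

Neither value estimates anything meaningful. Each only records where the iterative procedure happened to stop.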
### `detect_separation()`
`detect_separation()` is a *pre-fit* method, in the sense that it does not
need to estimate the model to detect separation and/or identify
infinite estimates. For example,
```{r, eval = TRUE, echo = TRUE}
endo_sep <- glm(HG ~ NV + PI + EH, data = endometrial,
                family = binomial("logit"),
                method = "detect_separation")
endo_sep
```
So, the actual maximum likelihood estimates are
```{r, echo = TRUE, eval = TRUE}
coef(endo_glm) + coef(endo_sep)
```
and the estimated standard errors are
```{r, echo = TRUE, eval = TRUE}
coef(summary(endo_glm))[, "Std. Error"] + abs(coef(endo_sep))
```
We can ask `detect_separation()` not only to detect separation but
also, if separation is detected, to solve an additional linear program
that checks whether the separation is complete or quasi-complete. We do
this by setting `separation_type = TRUE` in the `glm()` call:
```{r, echo = TRUE, eval = TRUE}
glm(HG ~ NV + PI + EH, data = endometrial, family = binomial("logit"),
    method = "detect_separation", separation_type = TRUE)
```
If a fitted `glm` object is already available, we can simply use
`update()` on it, changing the `method` argument and supplying any
extra options. For example,
```{r, echo = TRUE, eval = TRUE}
update(endo_glm, method = "detect_separation", separation_type = TRUE)
```
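To make the distinction between complete and quasi-complete separation concrete, here is a minimal base-R sketch for the special case of a single numeric covariate. The helper `separation_type_1d()` is hypothetical and is not how **detectseparation** works internally. The package solves linear programs that handle general model matrices, whereas this sketch only compares the ranges of the covariate in the two response groups.

``` r
# Hypothetical helper: classify separation for one covariate x and a
# binary response y, with both response values assumed to be present
separation_type_1d <- function(x, y) {
  x0 <- x[y == 0]
  x1 <- x[y == 1]
  # Gap between the two groups, allowing either group to lie on the left
  gap <- max(min(x1) - max(x0), min(x0) - max(x1))
  if (gap > 0) {
    "complete"        # a threshold splits the groups with no ties
  } else if (gap == 0 && length(unique(x)) > 1) {
    "quasi-complete"  # the groups touch only at a boundary value
  } else {
    "none"            # the groups overlap
  }
}

separation_type_1d(x = c(1, 2, 3, 4), y = c(0, 0, 1, 1))
separation_type_1d(x = c(1, 2, 2, 3), y = c(0, 0, 1, 1))
separation_type_1d(x = c(1, 3, 2, 4), y = c(0, 0, 1, 1))
```

In the first call the two response groups are split cleanly by any threshold between 2 and 3. In the second they share only the boundary value 2. In the third they overlap, so there is no separation.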
### `check_infinite_estimates()`
@lesaffre+albert:1989[, Section 4] describe a procedure that can hint
at the occurrence of infinite estimates. In particular, the model is
successively refitted, increasing the maximum number of allowed
iteratively re-weighted least squares iterations at each step. The
estimated asymptotic standard errors from each step are then divided
by the corresponding ones from the first fit. If the sequence of
ratios diverges, then the maximum likelihood estimate of the
corresponding parameter is minus or plus infinity. The following code
chunk applies this process to `endo_glm`.
```{r, echo = TRUE, eval = TRUE}
(inf_check <- check_infinite_estimates(endo_glm))
plot(inf_check)
```
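The refitting procedure can also be sketched by hand with base R, which may help when reading the output of `check_infinite_estimates()`. The code below is an illustration under stated assumptions and not the package's implementation: the toy data set `toy` and the helper `se_at_maxit()` are hypothetical, and the convergence tolerance is set very small so that each fit stops only because it reaches its iteration limit.

``` r
# Hypothetical toy data: every observation with x = 1 has y = 1,
# so the ML estimate of the coefficient of x is +Inf
toy <- data.frame(y = c(0, 1, 0, 1, 1, 1, 1, 1),
                  x = c(0, 0, 0, 0, 1, 1, 1, 1))

# Hypothetical helper: estimated standard errors after at most
# `maxit` iteratively re-weighted least squares iterations
se_at_maxit <- function(maxit) {
  fit <- suppressWarnings(
    glm(y ~ x, family = binomial(), data = toy,
        control = glm.control(epsilon = 1e-14, maxit = maxit))
  )
  coef(summary(fit))[, "Std. Error"]
}

# Refit with 1, 2, ..., 15 allowed iterations; one row per refit
se_steps <- t(sapply(1:15, se_at_maxit))

# Divide every row by the standard errors from the first fit
se_ratios <- sweep(se_steps, 2, se_steps[1, ], "/")
round(se_ratios, 2)
```

In this sketch the ratios for `(Intercept)` settle after the first few refits, whereas those for `x` keep growing. That growth is the pattern that points to an infinite estimate.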
# References