13  Unfinished Odds and Ends

13.1 Probability, Odds, and Logits

13.1.1 Odds

Most of us are comfortable with the notion of probability, but we find odds to be a little harder to understand. In everyday language, people use probability and odds interchangeably. Technically, they are related, but not at all the same thing.

Probability is the ratio of the number of outcomes in which the event of interest occurs to the total number of outcomes. Odds refers to the ratio of the number of outcomes in which the event occurs to the number of outcomes in which the event does not occur.

When the odds of winning are 3 to 1, there will be 3 wins for every loss.

Figure 13.1: The relationship between probability and odds. Note that probability is bounded by 0 and 1, whereas odds can range from 0 to positive infinity.

As seen in Figure 13.1, the relationship between probability and odds is not linear. To convert odds to probability:

p=\frac{odds}{1 + odds}

# The relationship between probability and odds
tibble(p = seq(0, .91, .01),
       odds = p / (1 - p)) %>%
  ggplot(aes(p, odds)) +
  geom_arrow(color = myfills[1],
             arrow_head = my_arrowhead) +
  scale_x_continuous("Probability",
                     breaks = seq(0, 1, .2),
                     labels = prob_label,
                     limits = c(0, 1)) +
  scale_y_continuous("Odds", breaks = seq(0, 10, 2)) +
  theme_minimal(base_size = 30, base_family = bfont)

Thus, odds of 3 to 1 correspond to a probability of:

odds <- 3 / 1

p <- odds / (1 + odds)
p
[1] 0.75

Likewise, odds of 2 to 3 correspond to a probability of:

odds <- 2 / 3
p <- odds / (1 + odds)
p
[1] 0.4

To convert probability to odds, one can simply use this formula:

Odds = \frac{p}{1-p}

If the probability of an event is .8, then the odds are:

p <- .8
odds <- p / (1 - p)
odds
[1] 4

Probabilities are bounded by 0 and 1, inclusive. Odds have a minimum of 0 but no upper bound (See Figure 13.1).

In many statistical applications, we need to convert a probability to a value that has neither upper nor lower bounds. A logit maps probabilities onto real numbers from negative to positive infinity (See Figure 13.2).

A logit is a portmanteau of log unit. In item-response theory, ability and item difficulty are expressed in terms of logits. Logits are sometimes called the log-odds because they are calculated as the (natural) log of the odds: \text{logit}(p)=\ln\left(\frac{p}{1-p}\right)
Figure 13.2: The relationship between probability and logits. Whereas probability is bounded by 0 and 1, logits range from negative to positive infinity.

A logit is the log-odds of probability.

\text{logit}\left(p\right)=\ln\left(\frac{p}{1-p}\right)
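
As a quick check, base R's qlogis and plogis functions convert between probabilities and logits:

# Logits from the definition and from base R's qlogis()
p <- .75
log(p / (1 - p))   # 1.098612, i.e., ln(3), because .75 corresponds to 3-to-1 odds
qlogis(p)          # same value
plogis(log(3))     # back to 0.75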

# The relationship between probability and logits
tibble(p = seq(0, 1, .001),
       logits = qlogis(p)) %>%
  ggplot(aes(p, logits)) +
  geom_arrow(color = myfills[1],
             arrow_head = my_arrowhead, 
             arrow_fins = my_arrowhead) +
  scale_x_continuous("Probability",
                     breaks = seq(0, 1, .2),
                     labels = prob_label) +
  scale_y_continuous("Logits",
                     breaks = seq(-10, 10, 2),
                     labels = signs) +
  theme_minimal(base_size = 30, base_family = bfont)

13.1.2 W Scores

In the same way that we transform z-scores to various kinds of standard scores, we can transform logits to other kinds of scales. The most prominent in psychological assessment is the W-score (AKA Growth Score Values), developed by Woodcock & Dahl (1971). An accessible discussion of its derivation can be found in Benson et al. (2018).

Woodcock, R. W., & Dahl, M. N. (1971). A common scale for the measurement of person ability and item difficulty (AGS Paper No. 10). American Guidance Service.
Benson, N. F., Beaujean, A. A., Donohue, A., & Ward, E. (2018). W Scores: Background and derivation. Journal of Psychoeducational Assessment, 36(3), 273–277. https://doi.org/10.1177/0734282916677433

If ability \theta is in logits, then W is calculated like so:

W = \frac{20}{\ln(9)}\theta+500

Although the coefficient \frac{20}{\ln(9)} may seem strange, it has some desirable features. It is equivalent to 20 times the base-9 logarithm of e:

20 \log_9(e) = \frac{20}{\ln(9)} \approx 9.1024
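
A quick check of the coefficient and a sample conversion from logits to the W scale:

# The W-scale coefficient
20 / log(9)                    # 9.102392
# Convert an ability of 1 logit to a W score
theta <- 1
(20 / log(9)) * theta + 500    # 509.1024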

Figure 13.3: The relationship between ability-difficulty differences and probability of answering correctly. Note that at −20, −10, 0, +10, and +20 the probabilities are exactly .10, .25, .50, .75, and .90, respectively.

This unusual coefficient yields some nice round probabilities of answering an item correctly when the person’s ability and the item’s difficulty differ by 0, 10, or 20 points (See Figure 13.3).

The probability p of answering an item correctly depends on the difference between the item’s difficulty W_D and the person’s ability W_A. Specifically, the relationship in Figure 13.3 works according to this equation:

p = \left(1 + 9^{\left(W_D-W_A\right)/20}\right)^{-1}
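
A quick check confirms the round probabilities mentioned above for ability advantages of 0, 10, and 20 W points:

# Probability correct when ability exceeds difficulty by 0, 10, and 20 W points
w_advantage <- c(0, 10, 20)
(1 + 9 ^ (-w_advantage / 20)) ^ (-1)   # 0.50 0.75 0.90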

# The relationship between ability-difficulty differences and probability of answering correctly.
w2p <- function(w, difficulty) {
  (1 + 9 ^ ((difficulty - w) / 20))^(-1)
}

tibble(
  w = seq(445, 555),
  p = w2p(w, 500),
  slope = (p - lag(p, default = 0)) / (w - lag(w, default = 1)),
  theta = atan(slope * 100) + ifelse(p <= .5, pi, -pi) / 2
) %>%
  ggplot(aes(w, p)) +
  geom_arrow(arrow_head = my_arrowhead,
             arrow_fins = my_arrowhead, 
             color = myfills[1]) +
  geom_point(data = . %>% filter(w %in% seq(450, 550, 10)),
             color = myfills[1]) +
  geom_richtext(
    data = . %>%
      filter(w %in% seq(450, 550, 10)),
    aes(
      label = prob_label(p, digits = 2),
      vjust = WJSmisc::angle2vjust(theta),
      hjust = WJSmisc::angle2hjust(theta)
    ),
    label.color = NA,
    label.padding = unit(0, "mm"),
    label.margin = unit(0.5, "mm"),
    size = ggtext_size(18),
    color = myfills[1]
  ) +
  scale_x_continuous(
    "Ability – Difficulty (in W units)",
    breaks = seq(450, 550, 10), 
    expand = expansion(),
    labels = \(x) paste0(signs::signs(x - 500), if_else(x >= 500, "", "\u2007"))
  ) +
  scale_y_continuous("Probability of Answering Correctly",
                     breaks = seq(0, 1, .10),
                     expand = expansion(),
                     labels = prob_label) +
  theme_minimal(base_size = 22, base_family = bfont) +
  theme(axis.title.x = element_text()) +
  coord_fixed(100,
              xlim = c(447, 553),
              ylim = c(0, 1),
              clip = "off")

13.2 Relative Proficiency

One of the benefits of item response theory models is that ability is expressed on the same scale as item difficulties. This feature allows us to make predictions about how likely it is that a person with a particular ability level will correctly answer an item of a particular difficulty.

The Relative Proficiency Index (RPI) was first used in the Woodcock Reading Mastery Tests (Woodcock, 1973). The RPI answers the following question:

The Relative Proficiency Index (RPI) tells us the probability that a person with ability \theta will answer an item correctly, given that a person with ability \mu_\theta has probability p of answering it correctly.
Woodcock, R. W. (1973). Woodcock Reading Mastery Tests. American Guidance Service.

When the average same-age peer with ability \mu_\theta has probability p of answering an item correctly, what is the probability that a person with ability \theta will answer it correctly?

The RPI can be calculated like so:

RPI = \left(1+e^{-\left(\theta-\mu_{\theta}+\text{logit}(p)\right)}\right)^{-1}

\begin{align*} \theta &= \text{Ability level of person (in logits)}\\ \mu_{\theta} &= \text{Average ability level (in logits) of a reference group}\\ p &= \text{Probability a person with ability } \mu_\theta \text{ will answer correctly}\\ \text{logit}(p) &= \ln\left(\frac{p}{1-p}\right) = p \text{ converted to logits} \end{align*}
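
This formula translates directly into a few lines of R. The function below is a minimal sketch for illustration only (the name rpi_logit is hypothetical, not part of any package); plogis is the logistic function \left(1+e^{-x}\right)^{-1}:

# A minimal RPI sketch that takes ability and the reference mean in logits
rpi_logit <- function(theta, mu_theta, p = .90) {
  plogis(theta - mu_theta + qlogis(p))
}
rpi_logit(theta = 1, mu_theta = 0)  # 0.9607297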

The psycheval package can calculate the RPI using the rpi function. By default, the primary inputs x and mu are assumed to be on the W scale, and the criterion p is .90.

Suppose a person’s ability corresponds to a W-score of 460 and same-age peers have an average W score of 500:

library(psycheval)
rpi(x = 460, mu = 500)
0.1

This difference of 40 W score points means that when typical same-age peers have a 90% chance of answering an item correctly (a common benchmark for mastery), this person has a 10% chance of answering the item correctly.
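
The same value can be reproduced by hand by converting the W-score difference to logits (one W point = ln(9)/20 logits) and applying the RPI formula:

# Reproduce rpi(x = 460, mu = 500) from the formula
theta <- (460 - 500) * log(9) / 20   # ability relative to peers, in logits
plogis(theta + qlogis(.90))          # 0.1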

If x and mu are logits, then you can specify scale = 1 like so:

rpi(x = 1, mu = 0, scale = 1)
0.9607297

The RPI works nicely for documenting deficits, but for gifted students, the RPI is quite high, often near 1. In such cases, we can also calculate the probability that a person with ability \mu_\theta will correctly answer an item that a person with ability \theta answers correctly with probability p_\theta:

RPI_{\text{reversed}} = \left(1+e^{-\left(\mu_\theta-\theta+\text{logit}(p_\theta)\right)}\right)^{-1}

Suppose a person has a W score of 550, which is 50 points higher than typical same-age peers. The standard RPI will give a value close to 1:

rpi(x = 550, mu = 500)
0.999543

This means that when same-age peers have a 90% chance of answering the item correctly, this person is almost certain to answer it correctly. Unfortunately, this fact does not convey the degree of giftedness in an evocative manner.

To get a better sense of how far advanced this person is compared to the performance of typical same-age peers, we can reverse the RPI like so:

rpi(x = 550, mu = 500, reverse = TRUE)
0.03571429

This means that when this person has a .9 probability of answering an item correctly, the typical same-age peer has about a .036 probability of answering it. Thus, this person is capable of completing tasks that are quite difficult for typical same-age peers.
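
The reversed value can be verified the same way by swapping the roles of \theta and \mu_\theta:

# Reproduce rpi(x = 550, mu = 500, reverse = TRUE) from the formula
theta <- (550 - 500) * log(9) / 20   # ability relative to peers, in logits
plogis(-theta + qlogis(.90))         # 0.03571429 (i.e., 1/28)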

The standard RPI refers to a proficiency level of .9, but the rpi function can calculate the relative proficiency index at any criterion level. For example:

rpi(x = 550, mu = 500, criterion = .10)
0.9642857

This means that when a typical same-age peer has a .10 probability of answering an item correctly, this person will answer it correctly with a .96 probability.

An Excel spreadsheet that calculates this “generalized” RPI can be found here.

13.3 Thresholds

The traditional threshold for diagnosing intellectual disability is 70. If a person’s observed IQ \left(r_{XX}=.97\right) is 68, what is the probability that the person’s true score is 70 or less?

More generally, given an observed score X with reliability coefficient r_{XX}, what is the probability that the associated true score T is less than or equal to threshold \tau?

When we predict the true score T with a specific value of X, the estimated true score \hat{T} is:

\hat{T}=r_{XX}\left(X-\mu_X\right)+\mu_X

The standard error of the estimate \sigma_{T-\hat{T}} in this prediction is:

\sigma_{T-\hat{T}}=\sigma_X\sqrt{r_{XX}-r_{XX}^2}

If we have reason to assume that the prediction errors are normally distributed, the probability that the true score T is less than or equal to threshold \tau can be calculated using the standard normal cumulative distribution function \Phi like so:

P(T\le\tau)=\Phi\left(\frac{\tau-\hat{T}}{\sigma_{T-\hat{T}}}\right)

Using this formula, we can see that, for our hypothetical person with an IQ of 68, the probability that the true score is 70 or lower is about .66. Figure 13.4 shows the probabilities for all values near the diagnostic threshold of 70.
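
The .66 figure can be verified with a few lines of R (the same quantities are computed again in the figure code below):

# Probability the true score is 70 or lower when the observed IQ is 68
rxx <- .97
T_hat <- rxx * (68 - 100) + 100    # estimated true score: 68.96
see <- 15 * sqrt(rxx - rxx ^ 2)    # standard error of the estimate: about 2.56
pnorm((70 - T_hat) / see)          # about .66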

Figure 13.4: The Probability a True Score Will Be Less Than or Equal to 70
# The Probability a True Score Will Be Less Than or Equal to 70

# Reliability Coefficient
rxx <- 0.97

# Mean of X
mu <- 100

# SD of X
sigma <- 15

# Threshold tau
threshold <- 70

# Standard error of the estimate in the prediction of T
see <- sigma * sqrt(rxx - rxx ^ 2)

# Bounds for plot
x_min <- 60
x_max <- 80

# Aspect ratio for plot
plot_ratio <- x_max - x_min


# Data
d <- tibble(x = seq(x_min, x_max, .2),
       T_hat = rxx * (x - mu) + mu,
       p = pnorm((threshold - T_hat) / see),
       slope = plot_ratio * (p - lag(p)) / (x - lag(x)),
       text_angle = atan(slope) + ifelse(p > 0, pi / 2, pi / -2))

# Data for integer points
d_points <- d |> 
  filter(p > .005 & p < .995 & x %in% seq(x_min, x_max))

ggplot(d, aes(x, p)) +
  geom_line(color = myfills[1], 
            linewidth = 1) +
  geom_point(data = d_points, 
             color = myfills[1], 
             pch = 16, 
             size = 2) +
  geom_richtext(data = d_points, 
                aes(label = WJSmisc::prob_label(p), 
                    hjust = WJSmisc::angle2hjust(text_angle),
                    vjust = WJSmisc::angle2vjust(text_angle)), 
                label.padding = unit(0, "mm"), 
                label.color = NA,
                color = myfills[1],
                size = ggtext_size(16),
                label.margin = unit(0.5, "mm")) +
    geom_richtext(data = d_points, 
                aes(label = x, 
                    hjust = WJSmisc::angle2hjust(text_angle + pi),
                    vjust = WJSmisc::angle2vjust(text_angle + pi)), 
                label.padding = unit(0, "mm"), 
                label.color = NA,
                color = myfills[1],
                size = ggtext_size(16),
                label.margin = unit(0.5, "mm")) +
  annotate("rect", 
           xmin = -Inf, 
           ymin = -Inf, 
           xmax = threshold, 
           ymax = Inf, 
           alpha = .2, 
           fill = myfills[1]) +
  scale_x_continuous("Observed Score",
                     expand = expansion(add = 0),
                     breaks = seq(40, 160, 5),
                     minor_breaks = seq(40, 160, 1)) +
  scale_y_continuous(paste0("Probability the True Score &le; ",
                            threshold),
                     expand = expansion(add = 0.04),
                     breaks = seq(0,1,.1),
                     labels = WJSmisc::prob_label) + 
  labs(caption = paste0("*Note:* *<span style='font-family: serif'>&mu;</span>* = 100, *<span style='font-family: serif'>&sigma;</span>* = 15, *r<sub>XX</sub>* = ", prob_label(rxx))) +
  coord_fixed(ratio = plot_ratio, clip = "off") + 
  theme(plot.caption = element_markdown(family = bfont),
        axis.title.y = element_markdown(family = bfont))

As a shortcut to using the formulas displayed above

13.3.1 Multivariate Thresholds

To diagnose intellectual disability, we need a standardized measure of intellectual functioning (usually an IQ test) and a well-validated measure of adaptive functioning. Suppose our two measures correlate at r = .40. The reliability coefficient of the IQ is r_{iq}=.97, and the reliability coefficient of the adaptive behavior composite is r_{ab}=.95. Both measures have a mean of \mu=100 and a standard deviation of \sigma=15.

Suppose that a person with IQ = 68 has an adaptive behavior composite of 67. What is the probability that both true scores are 70 or lower?

The vector of observed scores is:

X=\{68, 67\}

The vector of reliability coefficients:

r_{XX}=\{.97,.95\}

The correlation matrix is:

R_X=\begin{bmatrix} 1&.40\\ .40&1 \end{bmatrix}

The vector of means is:

\mu_X=\{100,100\}

The vector of standard deviations is:

\sigma_X=\{15,15\}

The observed covariance matrix is:

\Sigma_X=\mathrm{diag}\left(\sigma_X\right) R_X\, \mathrm{diag}\left(\sigma_X\right)

The true score covariance matrix is the same as the observed score covariance matrix except that the diagonal of \Sigma_X is multiplied by the vector of reliability coefficients r_{XX}:

\Sigma_T=\Sigma_X \circ \begin{bmatrix}.97&1\\1&.95\end{bmatrix}

The cross-covariances between X and T also equal \Sigma_T.

We can use equations from Eaton (2007, p. 116) to specify the conditional means and covariance matrix of the true scores, controlling for the observed scores:

Eaton, M. L. (2007). Multivariate statistics: A vector space approach. Inst. of Mathematical Statistics.

\begin{align} \mu_{T\mkern 2mu\mid X}&=\mu_X+\Sigma_T\Sigma_X^{-1}\left(X-\mu_X\right)\\ \Sigma_{T\mkern 2mu\mid X}&=\Sigma_T-\Sigma_T\Sigma_X^{-1}\Sigma_T^\prime \end{align}

We can imagine that the true scores conditioned on the observed scores are multivariate normal variates:

\left(T\mid X\right)\sim\mathcal{N}\left(\mu_{T\mkern 2mu\mid X}, \Sigma_{T\mkern 2mu\mid X}\right)

We can estimate the probability that both true scores are 70 or lower using the cumulative distribution function of the multivariate normal distribution with upper bounds of 70 for both IQ true scores and adaptive behavior true scores. Under the conditions specified previously, Figure 13.5 shows that the probability that both scores are 70 or lower is about .53.

Figure 13.5: If IQ = 68 and Adaptive Behavior = 67, what is the probability that both true scores are less than or equal to 70?
# If IQ = 68 and Adaptive Behavior = 67, 
# what is the probability that both true scores 
# are less than or equal to 70?

# Observed scores
X <- c(IQ = 68, AB = 67)

# Reliability coefficients
rxx <- c(IQ = .97, AB = .95)

# Correlation of IQ and AB
rxy <- .40

# Correlation matrix
R <- matrix(c(1, rxy, 
              rxy, 1), 
            ncol = 2)

# Means
mu_X <- c(IQ = 100, AB = 100)

# Standard deviations
s_X <- diag(c(IQ = 15, AB = 15))

# Covariance Matrix for X
sigma_X <- s_X %*% R %*% s_X

# Covariance Matrix for True Scores
sigma_T <- sigma_X
diag(sigma_T) <- diag(sigma_X) * rxx

# Conditional T means, given X
mu_T_X <- as.vector(mu_X + sigma_T %*% solve(sigma_X) %*% (X - mu_X))

# Conditional T covariance matrix, given X
sigma_T_X <- sigma_T - sigma_T %*% solve(sigma_X) %*% t(sigma_T)


# Upper thresholds
tau <- c(IQ = 70, AB = 70)

# Conditional probability T < tau given X
p <- mvtnorm::pmvnorm(upper = tau, mean = mu_T_X, sigma = sigma_T_X)


mvtnorm::rmvnorm(1000, mean = mu_T_X, sigma = sigma_T_X) |> 
  `colnames<-`(c("IQ", "AB")) |> 
  as_tibble() |> 
  mutate(truepositive = IQ < tau[1] & AB < tau[2]) |> 
  ggplot(aes(IQ, AB)) + 
  annotate("richtext", 
           x = 100, 
           y = 100, 
           label = "*r<sub>xy</sub>* = .40", 
           label.color = NA, 
           label.margin = unit(0,"mm"),
           size = ggtext_size(24)) + 
  annotate("rect", 
           xmin = -Inf, 
           ymin = -Inf, 
           xmax = tau[1], 
           ymax = tau[2], 
           fill = myfills[2], 
           alpha = .2) +
    geom_polygon(
    data = WJSmisc::cor_ellipse(
      rxy, 
      mean = mu_X, 
      sd = c(15,15)), 
    aes(x,y), 
    alpha = .1) +
  geom_polygon(
    data = WJSmisc::cor_ellipse(
      cov2cor(sigma_T_X)[2,1], 
      mean = mu_T_X, 
      sd = sqrt(diag(sigma_T_X))), 
    aes(x,y), 
    alpha = .1) +
  scale_x_continuous("IQ (*r<sub>xx</sub>* =.97)", breaks = seq(40, 160, 15), 
                     minor_breaks = seq(40, 160, 5), 
                     limits = c(40, 160), expand = expansion())  +
  scale_y_continuous("Adaptive Behavior Composite (*r<sub>yy</sub>* =.95)", 
                     breaks = seq(40, 160, 15), 
                     minor_breaks = seq(40, 160, 5), 
                     limits = c(40, 160), expand = expansion()) +
  geom_point(aes(color = truepositive), 
             size = 0.5, 
             pch = 16, 
             alpha = .4) + 
  scale_color_manual(values = myfills) + 
  coord_fixed(xlim = c(35,165), ylim = c(35,165)) + 
  theme(legend.position = "none", 
        axis.title.x = element_markdown(), 
        axis.title.y = element_markdown()) +
  annotate("point", x = X[1], y = X[2], size = 2) +
  annotate("text", x = 52.5, y = 50, 
           label = paste0("Given the observed\nscores, the probability\nboth true scores are\n 70 or below is ", 
                          WJSmisc::prob_label(p), "."),
           size = WJSmisc::ggtext_size(17)) +
  annotate("text",
           x = X[1],
           y = X[2],
           label = paste0("(", x = X[1], ",", y = X[2], ")"),
           hjust = 1.05,
           vjust = 1.1,
           size = WJSmisc::ggtext_size(24))

14 Multivariate Thresholds and True Scores

# Reliability coefficient
r_xx <- .96
# Variable names for the observed and true scores
v_name <- c("IQ", "IQ_true")
# Diagnostic threshold
threshold <- 70
# Joint covariance matrix of observed and true scores
sigma <- matrix(c(1, rep(r_xx, 3)), nrow = 2, dimnames = list(v_name, v_name)) * 15 ^ 2
# Means of observed and true scores
mu <- c(IQ = 100, IQ_true = 100)
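
# Probabilities of each joint region of observed (IQ) and true (IQ_true) scores:
# TP/FP/FN/TN = true/false positives/negatives,
# P/N = proportion selected/not selected by the observed score,
# D+/D- = proportion with true scores at or below / above the threshold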

p <- tibble(lower = list(c(IQ = -Inf, IQ_true = -Inf),
                    c(IQ = -Inf, IQ_true = threshold),
                    c(IQ = threshold, IQ_true = -Inf),
                    c(IQ = threshold, IQ_true = threshold),
                    c(IQ =  -Inf, IQ_true =  -Inf),
                    c(IQ =  threshold, IQ_true =  -Inf),
                    c(IQ =  -Inf, IQ_true =  -Inf),
                    c(IQ =  -Inf, IQ_true =  threshold)),
       upper = list(c(IQ = threshold, IQ_true = threshold),
                    c(IQ = threshold, IQ_true = Inf),
                    c(IQ = Inf, IQ_true = threshold),
                    c(IQ = Inf, IQ_true = Inf),
                    c(IQ = threshold, IQ_true = Inf),
                    c(IQ = Inf, IQ_true = Inf),
                    c(IQ = Inf, IQ_true = threshold),
                    c(IQ = Inf, IQ_true = Inf)),
       outcome = c("TP", "FP", "FN", "TN", "P", "N", "D+", "D-"),
       p = pmap_dbl(list(lower = lower, upper = upper), \(lower, upper) mvtnorm::pmvnorm(lower, upper, mean = mu, sigma = sigma) |> as.vector())) |> 
  select(outcome,p) |> 
  deframe()
p
         TP          FP          FN          TN           P           N 
0.017460927 0.005289205 0.003152489 0.974097379 0.022750132 0.977249868 
         D+          D- 
0.020613417 0.979386583 
tibble(Statistic = c("Sensitivity", 
                     "Specificity",
                     "PPV",
                     "NPV",
                     "Overall Accuracy",
                     "Prevalence",
                     "Selection Ratio",
                     "Positive Likelihood Ratio",
                     "Negative Likelihood Ratio",
                     "True Positive Rate",
                     "False Positive Rate",
                     "True Negative Rate",
                     "False Negative Rate"),
       Value = c(p["TP"] / p["D+"],
                 p["TN"] / p["D-"],
                 p["TP"] / p["P"],
                 p["TN"] / p["N"],
                 p["TP"] + p["TN"],
                 p["D+"],
                 p["P"],
                 (p["TP"] / p["D+"]) / (p["FP"] / p["D-"]),
                 (p["FN"] / p["D+"]) / (p["TN"] / p["D-"]),
                 p["TP"],
                 p["FP"],
                 p["TN"],
                 p["FN"]
                 )) 
# A tibble: 13 × 2
   Statistic                     Value
   <chr>                         <dbl>
 1 Sensitivity                 0.847  
 2 Specificity                 0.995  
 3 PPV                         0.768  
 4 NPV                         0.997  
 5 Overall Accuracy            0.992  
 6 Prevalence                  0.0206 
 7 Selection Ratio             0.0228 
 8 Positive Likelihood Ratio 157.     
 9 Negative Likelihood Ratio   0.154  
10 True Positive Rate          0.0175 
11 False Positive Rate         0.00529
12 True Negative Rate          0.974  
13 False Negative Rate         0.00315
tibble(statistic = c("Sensitivity", 
                     "Specificity", 
                     "PPV", 
                     "NPV"),
       lowerx = list(c(IQ = -Inf, IQ_true = -Inf),
                     c(IQ = threshold, IQ_true = -Inf),
                     c(IQ = -Inf, IQ_true = -Inf),
                     c(IQ = -Inf, IQ_true = threshold)),
       upperx = list(c(IQ = threshold, IQ_true = Inf),
                     c(IQ = Inf, IQ_true = Inf),
                     c(IQ = Inf, IQ_true = threshold),
                     c(IQ = Inf, IQ_true = Inf)),
       lower = list(c(IQ = -Inf, IQ_true = -Inf),
                    c(IQ = -Inf, IQ_true = threshold),
                    c(IQ = -Inf, IQ_true = -Inf),
                    c(IQ = threshold, IQ_true = -Inf)),
       upper = list(c(IQ = Inf, IQ_true = threshold),
                    c(IQ = Inf, IQ_true = Inf),
                    c(IQ = threshold, IQ_true = Inf),
                    c(IQ = Inf, IQ_true = Inf)),
       value = pmap_dbl(list(lowerx = lowerx,
                             upperx = upperx,
                             lower = lower,
                             upper = upper), tmvtnorm::ptmvnorm, 
                        mean = mu, 
                        sigma = sigma)) |> 
  select(statistic, value)
# A tibble: 4 × 2
  statistic   value
  <chr>       <dbl>
1 Sensitivity 0.847
2 Specificity 0.995
3 PPV         0.768
4 NPV         0.997
# The same approach with each statistic's integration and truncation bounds
# spelled out explicitly, including the four joint outcome probabilities
list(
  list(
    statistic = "Sensitivity",
    lowerx = c(IQ = -Inf, IQ_true = -Inf),
    upperx = c(IQ = threshold, IQ_true = Inf),
    lower = c(IQ = -Inf, IQ_true = -Inf),
    upper = c(IQ = Inf, IQ_true = threshold)
  ),
  list(
    statistic = "Specificity",
    lowerx = c(IQ = threshold, IQ_true = -Inf),
    upperx = c(IQ = Inf, IQ_true = Inf),
    lower = c(IQ = -Inf, IQ_true = threshold),
    upper = c(IQ = Inf, IQ_true = Inf)
  ),
  list(
    statistic = "PPV",
    lowerx = c(IQ = -Inf, IQ_true = -Inf),
    upperx = c(IQ = Inf, IQ_true = threshold),
    lower = c(IQ = -Inf, IQ_true = -Inf),
    upper = c(IQ = threshold, IQ_true = Inf)
  ),
  list(
    statistic = "NPV",
    lowerx = c(IQ = -Inf, IQ_true = threshold),
    upperx = c(IQ = Inf, IQ_true = Inf),
    lower = c(IQ = threshold, IQ_true = -Inf),
    upper = c(IQ = Inf, IQ_true = Inf)
  ),
  list(
    statistic = "True Positives",
    lowerx = c(IQ = -Inf, IQ_true = -Inf),
    upperx = c(IQ = threshold, IQ_true = threshold),
    lower = c(IQ = -Inf, IQ_true = -Inf),
    upper = c(IQ = Inf, IQ_true = Inf)
  ),
  list(
    statistic = "False Positives",
    lowerx = c(IQ = -Inf, IQ_true = threshold),
    upperx = c(IQ = threshold, IQ_true = Inf),
    lower = c(IQ = -Inf, IQ_true = -Inf),
    upper = c(IQ = Inf, IQ_true = Inf)
  ),
  list(
    statistic = "True Negatives",
    lowerx = c(IQ = threshold, IQ_true = threshold),
    upperx = c(IQ = Inf, IQ_true = Inf),
    lower = c(IQ = -Inf, IQ_true = -Inf),
    upper = c(IQ = Inf, IQ_true = Inf)
  ),
  list(
    statistic = "False Negatives",
    lowerx = c(IQ = threshold, IQ_true = -Inf),
    upperx = c(IQ = Inf, IQ_true = threshold),
    lower = c(IQ = -Inf, IQ_true = -Inf),
    upper = c(IQ = Inf, IQ_true = Inf)
  )
  ) |>
  map(\(l) {
    tmvtnorm::ptmvnorm(
      lowerx = l$lowerx,
      upperx = l$upperx,
      lower = l$lower,
      upper = l$upper,
      mean = mu,
      sigma = sigma
    ) |>
      as.vector() |>
      `names<-`(l$statistic)
  }) |>
  unlist() |> 
  enframe()
# A tibble: 8 × 2
  name              value
  <chr>             <dbl>
1 Sensitivity     0.847  
2 Specificity     0.995  
3 PPV             0.768  
4 NPV             0.997  
5 True Positives  0.0175 
6 False Positives 0.00529
7 True Negatives  0.974  
8 False Negatives 0.00315

14.1 Structure

Exploratory factor analysis

Confirmatory factor analysis

Structural equation modeling

14.2 Reliability

Retest reliability

Alternate-form reliability

Split-half reliability

Internal consistency

  • Cronbach’s Alpha
  • McDonald’s Omega

Conditional reliability (IRT)

14.3 Validity

Face validity

Content validity

Criterion-oriented validity

  • Concurrent validity
  • Predictive validity

Discriminant validity

Convergent validity

Incremental validity

Construct validity

14.4 Profiles

Difference scores

Outliers

Highest-lowest scores

Multivariate profile shape

Conditional profiles