IRTest


IRTest is a tool for item response theory (IRT) parameter estimation that is especially useful when the normality assumption on the latent distribution is suspected to be violated.
IRTest handles unidimensional latent variables.
In addition to the conventional approach that assumes a normal latent distribution, IRTest offers several methods for estimating the latent distribution:
+ empirical histogram method,
+ two-component Gaussian mixture distribution,
+ Davidian curve,
+ kernel density estimation.

Installation

You can install IRTest from the R console with:

install.packages("IRTest")

Functions

The following functions of IRTest are available to users.

Example

A simulation study for a Rasch model can be carried out as follows:

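library(IRTest)
library(ggplot2)  # needed for the ggplot2 calls (geom_line, aes, labs, ...) in the plotting code below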
Alldata <- DataGeneration(seed = 123456789,
                          model_D = rep(1, 20),
                          N=1000,
                          nitem_D = 20,
                          nitem_P = 0,
                          d = 1.664,
                          sd_ratio = 2,
                          prob = 0.3)

data <- Alldata$data_D
item <- Alldata$item_D
initialitem <- Alldata$initialitem_D
theta <- Alldata$theta
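
To make the data-generating step more concrete, the following base-R sketch simulates dichotomous responses under a Rasch model. It does not reproduce the internals of `DataGeneration()`: the helper name `simulate_rasch`, the standard-normal abilities, and the evenly spaced difficulties are assumptions made purely for illustration.

```r
# Hypothetical helper (not part of IRTest): simulate 0/1 responses under a Rasch model.
simulate_rasch <- function(theta, b) {
  # P(correct) = logistic(theta_i - b_j) for examinee i and item j
  p <- plogis(outer(theta, b, "-"))
  # One Bernoulli draw per examinee-item pair, reshaped to examinees x items
  matrix(rbinom(length(p), size = 1, prob = p), nrow = length(theta))
}

set.seed(1)
theta_sim <- rnorm(1000)                       # assumed: standard-normal abilities
b_sim     <- seq(-1.5, 1.5, length.out = 20)   # assumed: evenly spaced difficulties
resp      <- simulate_rasch(theta_sim, b_sim)

dim(resp)                  # examinees x items
round(colMeans(resp), 2)   # proportion correct decreases as difficulty increases
```

In the example above, the abilities come from a two-component Gaussian mixture rather than a standard normal distribution, which is the situation IRTest is designed for.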

For illustrative purposes, the empirical histogram method is used to estimate the latent distribution.

Mod1 <- IRTest_Dich(initialitem = initialitem,
                    data = data,
                    model = rep("1PL", 20),
                    latent_dist = "EHM",
                    threshold = .001
                    )
### True item parameters 
colnames(item) <- c("a", "b", "c")
knitr::kable(item, format='simple', caption = "True item parameters")
  a      b    c
---  -----  ---
  1  -0.96    0
  1   1.04    0
  1   0.47    0
  1  -0.16    0
  1  -0.81    0
  1  -0.40    0
  1   0.82    0
  1  -0.37    0
  1  -1.11    0
  1   0.50    0
  1  -0.97    0
  1  -1.05    0
  1   0.02    0
  1   1.32    0
  1  -0.50    0
  1   0.18    0
  1  -1.39    0
  1   0.59    0
  1  -0.58    0
  1  -1.59    0

True item parameters


### Estimated item parameters
knitr::kable(Mod1$par_est, format='simple', caption = "Estimated item parameters")
  a           b    c
---  ----------  ---
  1  -0.8177894    0
  1   0.9514716    0
  1   0.4703169    0
  1  -0.0574434    0
  1  -0.8503595    0
  1  -0.4316589    0
  1   0.8852200    0
  1  -0.3157931    0
  1  -1.1680628    0
  1   0.5366363    0
  1  -1.0744048    0
  1  -1.1621301    0
  1   0.0709861    0
  1   1.2536640    0
  1  -0.4265914    0
  1   0.2046360    0
  1  -1.3770776    0
  1   0.5984116    0
  1  -0.7533302    0
  1  -1.6965297    0

Estimated item parameters



### Plotting
par(mfrow=c(1,2))
plot(item[,2], Mod1$par_est[,2], xlab = "true", ylab = "estimated", main = "item parameters")
abline(a=0,b=1)
plot(theta, Mod1$theta, xlab = "true", ylab = "estimated", main = "ability parameters")
abline(a=0,b=1)
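
As a numerical complement to the scatter plots, parameter recovery can also be summarised with a few statistics. The sketch below uses the difficulty (b) values copied from the two tables above; the choice of RMSE, bias, and correlation as summaries is an illustrative assumption, not something IRTest reports by itself. In a live session the same vectors would be `item[,2]` and `Mod1$par_est[,2]`.

```r
# Difficulty (b) values copied from the "True item parameters" table
b_true <- c(-0.96, 1.04, 0.47, -0.16, -0.81, -0.40, 0.82, -0.37, -1.11, 0.50,
            -0.97, -1.05, 0.02, 1.32, -0.50, 0.18, -1.39, 0.59, -0.58, -1.59)

# Difficulty (b) values copied from the "Estimated item parameters" table
b_est <- c(-0.8177894, 0.9514716, 0.4703169, -0.0574434, -0.8503595,
           -0.4316589, 0.8852200, -0.3157931, -1.1680628, 0.5366363,
           -1.0744048, -1.1621301, 0.0709861, 1.2536640, -0.4265914,
            0.2046360, -1.3770776, 0.5984116, -0.7533302, -1.6965297)

rmse <- sqrt(mean((b_est - b_true)^2))  # root mean squared error
bias <- mean(b_est - b_true)            # average signed error
r    <- cor(b_true, b_est)              # linear agreement

round(c(RMSE = rmse, bias = bias, correlation = r), 3)
```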

plot_LD(Mod1)+
  geom_line(mapping = aes(colour="Estimated"))+
  geom_line(mapping=aes(x=seq(-6,6,length=121), 
                        y=dist2(seq(-6,6,length=121),prob = .3, d=1.664, sd_ratio = 2), 
                        colour="True"))+
  labs(title="The estimated latent density using 'EHM'", colour= "Type")+
  theme_bw()
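
The "True" curve in the plot above is a two-component Gaussian mixture. The base-R sketch below shows one way such a density can be built from `prob`, `d`, and `sd_ratio`. It assumes that `prob` is the weight of the first component, `d` is the distance between the component means in units of the overall standard deviation, `sd_ratio` is the ratio of the second component's standard deviation to the first's, and the overall distribution is standardised to mean 0 and variance 1. This parameterisation and the helper name `mixture_density` are assumptions for illustration and have not been checked against the source of `dist2()`.

```r
# Hypothetical helper (not part of IRTest): standardised two-component Gaussian mixture.
# Assumes overall mean 0 and overall variance 1; requires prob * (1 - prob) * d^2 < 1.
mixture_density <- function(x, prob, d, sd_ratio) {
  mu1 <- -(1 - prob) * d   # component means chosen so that the overall mean is 0
  mu2 <- prob * d
  # Solve prob*s1^2 + (1-prob)*s2^2 + prob*(1-prob)*d^2 = 1 with s2 = sd_ratio * s1
  sigma1 <- sqrt((1 - prob * (1 - prob) * d^2) / (prob + (1 - prob) * sd_ratio^2))
  sigma2 <- sd_ratio * sigma1
  prob * dnorm(x, mu1, sigma1) + (1 - prob) * dnorm(x, mu2, sigma2)
}

grid <- seq(-6, 6, length.out = 121)
dens <- mixture_density(grid, prob = 0.3, d = 1.664, sd_ratio = 2)
step <- grid[2] - grid[1]

# Numerical checks on the grid: area, mean, and variance
area <- sum(dens) * step
m1   <- sum(grid * dens) * step
m2   <- sum(grid^2 * dens) * step
round(c(area = area, mean = m1, variance = m2), 3)
```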

Each examinee’s posterior distribution is computed in the E-step of the estimation algorithm (i.e., the EM algorithm). The posterior distributions are stored in Mod1$Pk.
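
The idea behind this E-step quantity can be sketched in base R for a single examinee under a Rasch model: the posterior on a grid of ability values is proportional to the prior weight times the likelihood of the observed responses. The helper name `posterior_on_grid`, the discretised standard-normal prior, the five item difficulties, and the response pattern are all assumptions for illustration; this is not IRTest's internal code.

```r
# Hypothetical helper (not part of IRTest): posterior over a theta grid for one examinee.
posterior_on_grid <- function(u, b, grid, prior) {
  # Rasch success probabilities: rows are grid points, columns are items
  p <- plogis(outer(grid, b, "-"))
  # Likelihood of the response pattern u at each grid point
  lik <- apply(p, 1, function(pq) prod(pq^u * (1 - pq)^(1 - u)))
  post <- lik * prior
  post / sum(post)   # normalise so the posterior masses sum to 1
}

grid  <- seq(-6, 6, length.out = 121)
prior <- dnorm(grid)
prior <- prior / sum(prior)        # assumed: discretised standard-normal prior
b     <- c(-1, -0.5, 0, 0.5, 1)    # assumed item difficulties
u     <- c(1, 1, 1, 0, 0)          # assumed response pattern (1 = correct)

post     <- posterior_on_grid(u, b, grid, prior)
post_eap <- sum(grid * post)       # posterior mean of theta on the grid

sum(post)                # posterior masses sum to 1
grid[which.max(post)]    # grid point with the highest posterior mass
post_eap
```

Because the result holds probability masses on a grid, dividing by the grid spacing (0.1 here) turns them into density values, which is presumably why the plotting code multiplies `Mod1$Pk` by 10.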

set.seed(1)
selected_examinees <- sample(1:1000,6)
post_sample <- data.frame(X=rep(seq(-6,6, length.out=121),6), 
                          posterior = 10*c(t(Mod1$Pk[selected_examinees,])), 
                          ID=rep(paste("examinee", selected_examinees), each=121))
ggplot(data=post_sample, mapping=aes(x=X, y=posterior, group=ID))+
  geom_line()+
  labs(title="Posterior densities for selected examinees", x=expression(theta))+
  facet_wrap(~ID, ncol=2)+
  annotate(geom="line", x=seq(-6,6,length=121), 
                        y=dist2(seq(-6,6,length=121),prob = .3, d=1.664, sd_ratio = 2), colour="grey")+
  theme_bw()