# ERIC EJ1109318: Location Indices for Ordinal Polytomous Items Based on Item Response Theory. Research Report. ETS RR-15-20

Location Indices for Ordinal Polytomous

Items Based on Item Response Theory

Usama S. Ali

Hua-Hua Chang

Carolyn J. Anderson

December 2015

ETS Research Report Series

EIGNOR EXECUTIVE EDITOR

James Carlson

Principal Psychometrician

ASSOCIATE EDITORS

Beata Beigman Klebanov

Senior Research Scientist - NLP

Heather Buzick

Research Scientist

Brent Bridgeman

Distinguished Presidential Appointee

Keelan Evanini

Senior Research Scientist - NLP

Marna Golub-Smith

Principal Psychometrician

Shelby Haberman

Distinguished Presidential Appointee

Donald Powers

Managing Principal Research Scientist

Gautam Puhan

Principal Psychometrician

John Sabatini

Managing Principal Research Scientist

Matthias von Davier

Senior Research Director

Rebecca Zwick

Distinguished Presidential Appointee

PRODUCTION EDITORS

Kim Fryer

Manager, Editing Services

Ayleen Stellhorn

Editor

Since its 1947 founding, ETS has conducted and disseminated scientific research to support its products and services, and

to advance the measurement and education fields. In keeping with these goals, ETS is committed to making its research

freely available to the professional community and to the general public. Published accounts of ETS research, including

papers in the ETS Research Report series, undergo a formal peer-review process by ETS staff to ensure that they meet

established scientific and professional standards. All such ETS-conducted peer reviews are in addition to any reviews that

outside organizations may provide as part of their own publication processes. Peer review notwithstanding, the positions

expressed in the ETS Research Report series and other published accounts of ETS research are those of the authors and

not necessarily those of the Officers and Trustees of Educational Testing Service.

The Daniel Eignor Editorship is named in honor of Dr. Daniel R. Eignor, who from 2001 until 2011 served the Research and

Development division as Editor for the ETS Research Report series. The Eignor Editorship has been created to recognize

the pivotal leadership role that Dr. Eignor played in the research publication process at ETS.

ETS Research Report Series ISSN 2330-8516

RESEARCH REPORT

Location Indices for Ordinal Polytomous Items Based on

Item Response Theory

Usama S. Ali (1), Hua-Hua Chang (2), & Carolyn J. Anderson (2)

(1) Educational Testing Service, Princeton, NJ

(2) University of Illinois at Urbana-Champaign, Champaign, IL

Polytomous items are typically described by multiple category-related parameters; situations, however, arise in which a single index is needed to describe an item's location along a latent trait continuum. Situations in which a single index would be needed include item selection in computerized adaptive testing or test assembly. Therefore single location indices for ordinal polytomous items are proposed and studied. The proposed location indices (LIs) for polytomous items are mathematically derived based on the item category response functions (ICRFs) and item response function (IRF) for polytomous items. The ICRF approach resulted in three indices, $\mathrm{LI}_{\mathrm{mean}}$, $\mathrm{LI}_{\mathrm{trimmed\ mean}}$, and $\mathrm{LI}_{\mathrm{median}}$, and the IRF approach resulted in one proposed index, $\mathrm{LI}_{\mathrm{IRF}}$. An empirical example of real items is presented to help comprehension of the new location indices. Possible testing applications where the proposed item location indices are useful are discussed.

Keywords: graded-response model; item response function; item response theory; location index; partial credit models; polytomous items

doi: 10.1002/ets2.12065

Introduction

Items are the building blocks of psychological and educational tests, and the characteristics of the items determine the properties of the test. Item response theory (IRT) models specify the characteristics of items by parameters that are estimated from observed responses to items. These parameters act as descriptive statistics for the items. Researchers often use polytomous items for a variety of reasons, but mainly because these formats are more informative and reliable than dichotomously scored items. IRT models for polytomous items include the graded-response model (GRM; Samejima, 1969), partial credit model (PCM; Masters, 1982), generalized partial credit model (GPCM; Muraki, 1992), and nominal response model (Bock, 1972). In IRT models for dichotomous items, a single parameter conveys an item's difficulty; however, with a few exceptions, most of the previously mentioned polytomous models do not have a location parameter or location index associated with their standard definition. This presents a problem for using polytomous items in test assembly, in which typically a single index is required to determine what items to put on a test. Therefore this report discusses several variants of location indices that can be used with polytomous models. For a number of IRT Rasch models for polytomous items, item locations are well defined, such as the rating scale and successive interval models (see, e.g., Andrich, 1978, 1982; Rost, 1988).

Let $X_i = 0, 1, \ldots, m$ be the scores for ordinal, polytomous item $i$ with corresponding probabilities $P_{ix}(\theta)$ given a latent trait value $\theta$. The probabilities, $P_{ix}(\theta)$, for different levels of $\theta$ are considered the item category response functions (ICRFs), where particular IRT models specify a (logistic) regression with a location or intercept parameter for response options as well as a slope for $\theta$. Because the ICRFs have at least $m$ parameters gauging difficulty and summarizing or describing the location of the item on the latent trait, defining an index that represents an item's location along the underlying continuum is not straightforward. As explained in later sections, the slope parameter for $\theta$, $a_i$, does not represent the discrimination of a polytomous item as it does in IRT models for dichotomous items. The slope parameter, $a_i$, combined with other category parameters defines the polytomous item discrimination (Embretson & Reise, 2000). The structure of polytomous IRT models leads to challenges in interpreting these parameters. For a dichotomous item, a single curve, an item characteristic curve or item response function (IRF), depicts the relationship between the latent variable and the probability of correct response; however, for an item with $(m + 1)$ response options, there exist $(m + 1)$ ICRFs, one for each response category. Therefore it is essential to deal with $(m + 1)$ curves to extract the information about item properties. As an example of the complexity, Figure 1 presents five ICRFs of an item with five response options.

Figure 1 Item category response functions for a five-category item (horizontal axis: ability).

Corresponding author: U. S. Ali, E-mail: uali@ets.org

ETS Research Report No. RR-15-20. © 2015 Educational Testing Service

For polytomous items, there is a distinction between an ICRF and an IRF. An ICRF gives the probability for a response option, whereas an IRF gives the expected value of $X_i$ as a weighted sum of the ICRFs; that is,

$$E(X_i) = P_{i1}(\theta) + 2P_{i2}(\theta) + \cdots + mP_{im}(\theta) = \sum_{x=1}^{m} x P_{ix}(\theta). \tag{1}$$

Chang and Mazzeo (1994) addressed the correspondence between an IRF and sets of ICRFs for a polytomous item. This correspondence is automatically satisfied for all dichotomous models, for which the correct and incorrect response curves of a dichotomous item (i.e., two ICRFs) can be summarized by one curve (i.e., when $m = 1$ in Equation 1, the IRF becomes equivalent to the correct response curve) that carries all information about item properties, as shown in Equation 2:

$$P_{i1}(\theta) = E(X_i). \tag{2}$$

According to Chang and Mazzeo, the item structure of a polytomous item is uniquely determined by its IRF for most commonly used models, and therefore the shape of the IRF contains all the information about the item, and no information will be lost by studying only this single curve. However, their study did not quantify the location of the IRF. More specifically, if an IRF uniquely determines the item structure, it is important to identify a location parameter for the IRF. The current report continues that effort by proposing a single index for polytomous items based on polytomous IRT models.

To summarize, the motivation and rationale to search for a central or an overall location parameter is twofold: (a) the complexity of multiple and different parameterizations for a polytomous item even for the same model and (b) the lack of a global item location parameter, which prevents the use of polytomous items in many testing applications, such as the usage of certain item selection methods in adaptive testing where providing such methods in a polytomous case is a challenge. The difference between the dichotomous and polytomous items in terms of parameters is the basis of the current report. New item indices for polytomous items are defined using the properties of an item's IRF and ICRFs. To provide a basis for the new measures, an overview is given of the different but most commonly used model parameterizations for polytomous items. These polytomous item response models are designed to analyze items with ordered response options.

The remainder of this report is structured as follows. In the next section, we briefly review the IRT models for which global location indices are developed, namely the GRM (Samejima, 1969) and the partial credit models (hereinafter referring to both the PCM; Masters, 1982, and the GPCM; Muraki, 1992). After reviewing these models, global item location indices are proposed, and their characteristics are studied. An extended example using items from the National Assessment of Educational Progress is presented, followed by a discussion of the prospects for the use of the new indices.

Polytomous Item Response Models

The GRM (Samejima, 1969) and the partial credit models (Masters, 1982; Muraki, 1992) share a main characteristic: They

have the same discrimination for each of the response options for an item. As noted in the following, the models differ in

other aspects.

Graded Response Model

Samejima (1969) proposed a model for items that are characterized by ordinal response categories (e.g., Likert-scale items). The GRM expresses the cumulative probability of getting at least a score $x$:

$$P(X_i \ge x \mid \theta) = P^*_{ix}(\theta) = \frac{\exp\left(a_i(\theta - b_{ix})\right)}{1 + \exp\left(a_i(\theta - b_{ix})\right)}, \tag{3}$$

where $x = 1, 2, \ldots, m$, $a_i$ is the item slope parameter, and $b_{ix}$ is a threshold parameter representing the point along the $\theta$ scale at which examinees have .50 probability of responding in or above category $x$, and therefore the probability of responding in a specific category score is $P_{ix} = P^*_{ix} - P^*_{i,x+1}$. Because the probability of a response for a specific category is the difference between the cumulative probabilities of two adjacent scores, Samejima's GRM is considered a difference model in Thissen and Steinberg's (1986) classification of IRT models. Note that it is assumed in the GRM that $P^*_{i0} = 1$ and $P^*_{i,m+1} = 0$ and that the $b_{ix}$s are ordered such that $b_{i1} < b_{i2} < \cdots < b_{im}$.
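The difference structure of the GRM can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the report: the function name `grm_category_probs` and the parameter values in the usage line are hypothetical.

```python
import math

def grm_category_probs(theta, a, b):
    """Category probabilities P_i0..P_im for the GRM (Equation 3).

    b holds the m ordered thresholds b_i1 < ... < b_im.
    """
    # Cumulative probabilities P*_ix, padded with P*_i0 = 1 and P*_{i,m+1} = 0.
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bx))) for bx in b] + [0.0]
    # Each category probability is the difference of adjacent cumulative ones.
    return [cum[x] - cum[x + 1] for x in range(len(b) + 1)]

# Hypothetical four-category item evaluated at theta = 0.
probs = grm_category_probs(theta=0.0, a=1.2, b=[-1.0, 0.0, 1.5])
print([round(p, 3) for p in probs])
```

Because $\theta = 0$ coincides with the second threshold in this example, the probability of scoring 2 or higher is .50, which matches the interpretation of $b_{ix}$ given above.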

Partial Credit Models

Two versions of PCMs are considered here: Masters's (1982) PCM and Muraki's (1992) GPCM. Masters's PCM is based on a different conceptualization than the GRM and hence has a different parameterization. The PCM is suited to model sums of binary responses that are not supposed to be stochastically independent (Verhelst & Verstralen, 2008), and it is considered an extension of the dichotomous Rasch model to the polytomous case. The PCM belongs to the adjacent-category models in Mellenbergh's (1995) classification of IRT models and to the divide-by-total models in Thissen and Steinberg's (1986) classification. As a model in the Rasch family, the PCM considers items to be equally discriminating (i.e., they all have the same slope, $a_i = a$). Allowing items to be differentially discriminating yields the GPCM (Muraki, 1992, 1993).

The GPCM for an examinee with ability $\theta$ states that the probability of getting a score $x$ on item $i$, denoted by $P_{ix}(\theta)$, is

$$P_{ix}(\theta) = \frac{\exp\left[\sum_{v=1}^{x} D a_i (\theta - b_{iv})\right]}{1 + \sum_{c=1}^{m} \exp\left[\sum_{v=1}^{c} D a_i (\theta - b_{iv})\right]}, \tag{4}$$

where $D$ is a scaling constant that either puts the trait scale in the same metric as the normal ogive model ($D = 1.7$) or keeps it on the metric of the logistic model ($D = 1$, which is used from this point on), $a_i$ is a slope parameter for item $i$, and $b_{iv}$ are $m$ threshold parameters. For $x = 0$ the sum in the numerator is empty, so the numerator equals 1. Special cases of the GPCM include Birnbaum's (1968) two-parameter logistic model when $m = 1$, the PCM when $a_i = a$, and the Rasch model for dichotomous items when $m = 1$ and $a_i = a$. It should be noted that in the PCMs, the thresholds need not be ordered. Note that in the parameterization of the GPCM implemented in PARSCALE (Muraki & Bock, 2003), there is a location $b_i$ and threshold distances $d_{iv} = b_i - b_{iv}$ that add up to 0. For example, for a three-category item, there is one location parameter and two threshold distances, where $d_{i1} = -d_{i2}$.
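The GPCM probabilities and the expected score of Equation 1 can be sketched as follows. This is an illustrative sketch only: the function names `gpcm_category_probs` and `expected_score` and the example parameter values are assumptions, and subtracting the largest exponent before exponentiating is a numerical-stability choice that does not appear in the report.

```python
import math

def gpcm_category_probs(theta, a, b, D=1.0):
    """Category probabilities P_i0..P_im for the GPCM (Equation 4)."""
    # The exponent for score x is the sum over v = 1..x of D*a*(theta - b_iv);
    # it is 0 for x = 0, which gives the leading 1 in the denominator.
    z = [0.0]
    for bv in b:
        z.append(z[-1] + D * a * (theta - bv))
    zmax = max(z)  # shift exponents so that exp() cannot overflow
    e = [math.exp(v - zmax) for v in z]
    total = sum(e)
    return [v / total for v in e]

def expected_score(theta, a, b, D=1.0):
    """IRF of Equation 1: the sum over x of x * P_ix(theta)."""
    probs = gpcm_category_probs(theta, a, b, D)
    return sum(x * p for x, p in enumerate(probs))

# Hypothetical four-category item with a reversal (b_i2 < b_i1).
print(round(expected_score(0.0, 0.7, [0.2, -0.4, 1.1]), 3))
```

With a single threshold ($m = 1$) the function reduces to the two-parameter logistic curve, in line with the special cases listed above.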

Figure 2 Item category response functions for a three-category item (curves: P_0, P_1, P_2).

Proposed Global Item Location Indices

Two general approaches are used to develop item location indices for polytomous items. The first approach is to study the

category response functions, and it will consider only the PCMs but not the GRM. The second one focuses on the IRF

for all models. Basically, the proposed indices are based on the ICRFs and IRF of a polytomous item using the preceding

models (Ali, 2011).

Indices Based on the Item Category Response Function

Using a partial credit model (i.e., PCM or GPCM), we developed indices for a polytomous item with three categories and subsequently generalized the indices to items with more response options. Consider a three-category item that follows the partial credit models (i.e., PCM and GPCM). The parameters for item $i$ are $a_i$, $b_{i1}$, and $b_{i2}$. There are three ICRFs for the three possible scores 0, 1, or 2 on the item; that is, $P_{i0}(\theta)$, $P_{i1}(\theta)$, and $P_{i2}(\theta)$. Examples of three such ICRFs for a three-response option item based on a partial credit model are shown in Figure 2. Note that the ICRFs intersect with each other at some points; the zero- and perfect-score ICRFs intersect at a point corresponding to the peak of the partial credit ICRF. Thus the middle score ICRF plays a key role in the development of an item's location index. It can be shown mathematically that this will always occur. Using the probability of attaining the partial credit (middle) score, $P_{i1}(\theta)$, we locate the peak of this ICRF. From Equation 4,

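$$P_{i1}(\theta) = \frac{\exp\left[a_i(\theta - b_{i1})\right]}{1 + \exp\left[a_i(\theta - b_{i1})\right] + \exp\left[a_i(2\theta - b_{i1} - b_{i2})\right]}, \tag{5}$$

and the first derivative of Equation 5 with respect to $\theta$ is

$$\frac{\partial P_{i1}(\theta)}{\partial \theta} = a_i P_{i1}(\theta) - a_i P_{i1}(\theta) \sum_{c=1}^{2} c P_{ic}(\theta) = a_i P_{i1}(\theta) \left[1 - \sum_{c=1}^{2} c P_{ic}(\theta)\right]. \tag{6}$$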

Setting Equation 6 equal to zero, we find that the derivative vanishes, and hence the maximum of Equation 5 can occur, only when one of the following three conditions holds: $a_i = 0$, $P_{i1}(\theta) = 0$, or $\sum_{c=1}^{2} c P_{ic}(\theta) = 1$ (i.e., $E(X_i) = 1$). The first condition, $a_i = 0$, indicates that the item has no discrimination power, and hence it would not be in an operational item pool. The second condition, $P_{i1}(\theta) = 0$, is not achievable because all response options have nonzero probability. The third condition, $\sum_{c=1}^{2} c P_{ic}(\theta) = 1$ or $E(X_i) = 1$, can be attained, as seen by noting that

$$\begin{aligned}
\sum_{c=1}^{2} c P_{ic}(\theta) &= 1 \\
P_{i1}(\theta) + 2P_{i2}(\theta) &= 1 \\
\exp\left[a_i(\theta - b_{i1})\right] + 2\exp\left[a_i(2\theta - b_{i1} - b_{i2})\right] &= 1 + \exp\left[a_i(\theta - b_{i1})\right] + \exp\left[a_i(2\theta - b_{i1} - b_{i2})\right] \\
\exp\left[a_i(2\theta - b_{i1} - b_{i2})\right] &= 1 \\
a_i(2\theta - b_{i1} - b_{i2}) &= 0 \\
\theta &= \tfrac{1}{2}\left(b_{i1} + b_{i2}\right).
\end{aligned} \tag{7}$$

With regard to the point on the ability continuum corresponding to the intersection between the two ICRFs of scores 0 and 2, it satisfies the following condition:

$$\begin{aligned}
P_{i0}(\theta) &= P_{i2}(\theta) \\
\frac{1}{1 + \exp\left[a_i(\theta - b_{i1})\right] + \exp\left[a_i(2\theta - b_{i1} - b_{i2})\right]} &= \frac{\exp\left[a_i(2\theta - b_{i1} - b_{i2})\right]}{1 + \exp\left[a_i(\theta - b_{i1})\right] + \exp\left[a_i(2\theta - b_{i1} - b_{i2})\right]}.
\end{aligned} \tag{8}$$

The equivalence in Equation 8 implies that

$$\exp\left[a_i(2\theta - b_{i1} - b_{i2})\right] = 1, \tag{9}$$

which is the same conclusion as given in Equation 7; that is, $\theta = \tfrac{1}{2}(b_{i1} + b_{i2})$.

Because we verified that the two ICRFs for the lowest and highest scores on the item intersect, we note that this corresponds to the same point on the ability scale as the maximum of the ICRF of the partial credit score (see Figure 2 for an example of a three-category item).
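The three-category result can be checked numerically. The sketch below is only an illustration: the parameter values are made up rather than taken from the report, and the peak is located by a simple grid search instead of by the analytic derivative.

```python
import math

def pcm3_probs(theta, a, b1, b2):
    """ICRFs P_i0, P_i1, P_i2 of a three-category partial credit item (Equation 5)."""
    e1 = math.exp(a * (theta - b1))
    e2 = math.exp(a * (2 * theta - b1 - b2))
    total = 1.0 + e1 + e2
    return 1.0 / total, e1 / total, e2 / total

a, b1, b2 = 0.8, -0.5, 1.3      # illustrative values, not from the report
midpoint = 0.5 * (b1 + b2)      # Equation 7

# Locate the peak of the middle ICRF on a grid with step 0.001.
grid = [-4 + 0.001 * k for k in range(8001)]
peak = max(grid, key=lambda t: pcm3_probs(t, a, b1, b2)[1])

p0, p1, p2 = pcm3_probs(midpoint, a, b1, b2)
print(midpoint, round(peak, 3), abs(p0 - p2) < 1e-12)
```

At the midpoint the extreme-score curves coincide and the expected score $P_{i1} + 2P_{i2}$ equals 1, which is the third condition derived from Equation 6.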

For the more general case of a polytomous item with $m + 1$ categories, where every two ICRFs of scores $x$ and $m - x$ intersect in a point (e.g., $P_{i0}(\theta) = P_{im}(\theta)$), we have the following conditions: $P_{i1}(\theta) = P_{i,m-1}(\theta), \ldots, P_{ix}(\theta) = P_{i,m-x}(\theta), \ldots, P_{i,(m-1)/2}(\theta) = P_{i,(m+1)/2}(\theta)$. See Figure 1 for an example of a five-category item. Following the same logic used to obtain the result for the three-category item, we start with two curves at their point of intersection; that is,

$$P_{ix}(\theta) = P_{i,m-x}(\theta).$$

Substituting the model for each of the functions and simplifying yields

$$\begin{aligned}
x\theta - \sum_{c=1}^{x} b_{ic} &= (m - x)\theta - \sum_{c=1}^{m-x} b_{ic}, \\
\left[(m - x) - x\right]\theta &= \sum_{c=1}^{m-x} b_{ic} - \sum_{c=1}^{x} b_{ic}, \\
(m - 2x)\theta &= \sum_{c=x+1}^{m-x} b_{ic}.
\end{aligned} \tag{10}$$

This last relation suggests that a reasonable location index is based on

$$\theta = \frac{1}{m - 2x} \sum_{c=x+1}^{m-x} b_{ic}, \qquad x = 0, 1, \ldots \text{ with } 2x < m. \tag{11}$$

In other words, when $m$ is odd, the two middle ICRFs intersect at $\theta = b_{i,(m+1)/2}$. When $m$ is an even integer, as represented by the five-category item example in Figure 1, such that there is one middle ICRF representing the score of $m/2$, we need a point that corresponds to the maximum of this ICRF.
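The general intersection formula can be verified against the model directly. The following sketch assumes the partial credit parameterization of Equation 4 with $D = 1$; the helper names `gpcm_probs` and `intersection_point` and the threshold values are hypothetical.

```python
import math

def gpcm_probs(theta, a, b):
    """GPCM category probabilities (Equation 4 with D = 1)."""
    z = [0.0]
    for bv in b:
        z.append(z[-1] + a * (theta - bv))
    e = [math.exp(v) for v in z]
    s = sum(e)
    return [v / s for v in e]

def intersection_point(b, x):
    """Equation 11: theta at which the ICRFs of scores x and m - x cross."""
    m = len(b)
    if m - 2 * x <= 0:
        raise ValueError("Equation 11 needs m - 2x > 0")
    # b[x:m - x] holds b_{i,x+1}, ..., b_{i,m-x} (Python lists are 0-based).
    return sum(b[x:m - x]) / (m - 2 * x)

b = [-1.2, -0.3, 0.4, 1.5]      # m = 4 (five categories); illustrative values
for x in range(2):
    theta = intersection_point(b, x)
    p = gpcm_probs(theta, 0.9, b)
    print(x, round(theta, 4), abs(p[x] - p[len(b) - x]) < 1e-12)
```

Note that the slope $a_i$ does not enter `intersection_point`: it cancels when the two exponents are equated, exactly as in the derivation above.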

To conclude, Table 1 summarizes the relationship of category characteristic curves of a polytomous item with ordered response options scored 0 to $m$ and the formula of the corresponding intersection points on the ability scale with reference to the definition of such scale values. This overall summary of the relations among ICRFs suggests the following proposed location indices. On the basis of the preceding mathematical derivation, we propose alternative forms of a location index (LI) for a polytomous item.

Table 1 Studied Item Category Response Functions and Corresponding Intersection Points

| ICRFs | Intersection point | Notes |
|---|---|---|
| $C_0, C_1$ | $b_1$ | Model definition |
| $C_1, C_2$ | $b_2$ | Model definition |
| $C_{x-1}, C_x$ | $b_x$ | Model definition |
| $C_0, C_2$ | $0.5(b_1 + b_2)$ | The same as the peak of $C_1$ for a three-category item |
| $C_x, C_{m-x}$ | $(m - 2x)^{-1} \sum_{c=x+1}^{m-x} b_{ic}$ | General form of intersecting point of $x$ and $m - x$ score curves |
| $C_x, C_{m-y}$ | $(m - x - y)^{-1} \sum_{c=x+1}^{m-y} b_{ic}$ | More general form of intersecting point of any two curves (scores $x$ and $m - y$) |
| $C_0, C_m$ | $m^{-1} \sum_{c=1}^{m} b_{ic}$ | General form for the intersecting point of 0 and $m$ score curves |

Note. $C_v$ = item characteristic curve for score $v$.

Proposal 1

The first form of an LI is the average of the item category difficulties, which takes all ICRFs into account ($\mathrm{LI}_{\mathrm{mean}}$), obtained by substituting $x = 0$ into Equation 11:

$$\mathrm{LI}_{\mathrm{mean}} = \frac{1}{m} \sum_{c=1}^{m} b_{ic}. \tag{12}$$

Note that this proposed index is very similar to the item location of Andrich's rating scale model, and it is also the location parameter $b$ generated by PARSCALE (Muraki & Bock, 2003).

Proposal 2

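The second form of LI is the median of item category difficulties ($\mathrm{LI}_{\mathrm{median}}$), which is a possible choice in statistics, as follows:

$$\mathrm{LI}_{\mathrm{median}} = \operatorname{Median}(b_{ix}) = \begin{cases} \tfrac{1}{2}\left(b_{ix}^{(k)} + b_{ix'}^{(k+1)}\right), & \text{if } m \text{ is even}, \\ b_{ix}^{(k)}, & \text{if } m \text{ is odd}, \end{cases} \tag{13}$$

where $b_{ix}^{(k)}$ is the threshold parameter that has the $k$th rank among the thresholds of the $i$th item and has score $x$, $b_{ix'}^{(k+1)}$ is the threshold parameter that has the $(k + 1)$th rank, and $k = m/2$ when $m$ is even and $k = (m + 1)/2$ when $m$ is odd, as the definition of the median of $m$ values requires.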

Proposal 3

The third form of an LI is the truncated (trimmed or Windsor) mean; that is, the average of item category difficulties that takes all ICRFs into account, except the zero- and perfect-score curves if there are no reversals ($\mathrm{LI}_{\mathrm{trimmed\ mean}}$), obtained by substituting $x = 1$ into Equation 11:

$$\mathrm{LI}_{\mathrm{trimmed\ mean}} = \frac{1}{m - 2} \sum_{c=2}^{m-1} b_{ic}. \tag{14}$$

In the case of reversal, the threshold parameters should be rank ordered in a similar way as in the previous index.
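The three ICRF-based indices reduce to simple summary statistics of the thresholds, as the following sketch shows. The function names are hypothetical; sorting before trimming reflects the rank ordering prescribed for reversals, and the example thresholds are those reported for Item 1 in Table 2.

```python
from statistics import mean, median

def li_mean(b):
    """Equation 12: average of all m thresholds."""
    return mean(b)

def li_median(b):
    """Equation 13: median of the rank-ordered thresholds."""
    return median(b)

def li_trimmed_mean(b):
    """Equation 14 on rank-ordered thresholds: drop the smallest and the largest."""
    if len(b) < 3:
        raise ValueError("need at least three thresholds to trim both ends")
    return mean(sorted(b)[1:-1])

item1 = [-0.49, 0.81, 1.77]     # thresholds of Item 1 in Table 2
print(round(li_mean(item1), 2), li_median(item1), li_trimmed_mean(item1))
```

For an item with three thresholds the trimmed mean and the median both pick out the middle threshold, so the two indices coincide for four-category items.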

Index Based on the Item Response Function

The ease of calculating a polytomous IRF follows from the fact that an IRF can be thought of as describing the rate of change of the expected value of an item response as a function of the change in $\theta$ relative to an item's location $b_i$ (Nering & Ostini, 2006). More succinctly, this can be thought of as a regression of the item score onto the trait ability (Chang & Mazzeo, 1994; Lord, 1980).

The previous three proposals of LI are based on the ICRFs; hence they are considered as local indices by the nature of

information gained from curves of specific score categories. Conversely, this is not the case in polytomous models. Chang

and Mazzeo (1994) showed that the IRF for a polytomously scored item is defined as a weighted sum of the ICRFs (the

probability of getting a particular score for a randomly sampled examinee of ability); that is, it is defined by the expected

value or mean of the scores.

The IRF, as defined in Equation 1, ranges from 0 to $m$ (i.e., the maximum possible score category of an item). Chang and Mazzeo (1994) established the correspondence between an IRF and a unique set of ICRFs for two of the most commonly used polytomous IRT models (GRM and the partial credit models). Specifically, they provided a proof of the following statement for these models:

If two items have the same IRF, then they must have the same number of categories; moreover, they must consist of

the same ICRFs.

The condition on which the proof depends is that the discrimination parameter for each item does not depend on the

category (i.e., for a given item, the a parameter is the same for each category or response option). The GRM, PCM, and

GPCM all satisfy this condition, but the nominal response model does not satisfy it.

Along the same lines, Akkermans and Muraki (1997) introduced an IRF defined as a normalized expected score (i.e., the weighted sum of ICRFs divided by the number of item categories) that ranges from 0 to 1. Akkermans and Muraki's IRF differs in terms of the range from that introduced by Chang and Mazzeo (1994). Akkermans and Muraki (1997) introduced the gradient (i.e., first derivative) of the IRF as an item discrimination function, $G_i(\theta)$,

$$G_i(\theta) = \frac{\partial T_i(\theta)}{\partial \theta} = a_i \left[\sum_{x=1}^{m} x^2 P_{ix}(\theta) - \left(\sum_{x=1}^{m} x P_{ix}(\theta)\right)^2\right] = \frac{I_i(\theta)}{a_i}, \tag{15}$$

where $T_i(\theta)$ is the IRF, also called the scoring function (Andrich, 1978), and $I_i(\theta)$ is the item information (for the specific formulas for different models, see Dodd, de Ayala, & Koch, 1995; Muraki, 1993; Nering & Ostini, 2010).

The polytomous IRF has various merits. First, the IRF carries the full information of the item and encompasses the

partial amount of information included in ICRFs. Second, it is valid to apply the expected score to the most commonly

used ordinal response models (i.e., GRM, PCM, and GPCM). Third, the IRF is well connected to Fisher information (see

Equation 15). For the preceding properties of the IRF or expected score of a polytomous item, it is worthwhile to use it to

propose a central location parameter.

Proposal 4

The fourth form of LI is derived from the polytomous IRF. Binary IRT models such as one-, two-, or three-parameter logistic models have an important feature in that the conditional mean of the item score (i.e., expectation) is the same as the probability of answering the item correctly. Note that the dichotomous IRF uses the value of .50 (if there is no guessing) as a threshold to determine the item location where the highest score is 1. By the same analogy, the index of a polytomous IRF corresponds to an expected score of $0.5m$, where $m$ is the highest possible score for an $(m + 1)$-response category item. Because this value has a global nature in that it considers the IRF, we call it $\mathrm{LI}_{\mathrm{IRF}}$:

$$\mathrm{LI}_{\mathrm{IRF}} = \theta : E[X_i] = \frac{m}{2}. \tag{16}$$

For example, the $\theta$ point that corresponds to $0.5m$ under the partial credit models can be obtained through the following equation, for which a closed-form solution is complicated to produce:

$$\sum_{x=0}^{m} (2x - m) \exp\left[a_i\left(x\theta - \sum_{v=1}^{x} b_{iv}\right)\right] = 0. \tag{17}$$


An iterative algorithm is used to obtain the $\mathrm{LI}_{\mathrm{IRF}}$ for each polytomous item. Here are the details of the Newton-Raphson method to obtain the approximate value of $\mathrm{LI}_{\mathrm{IRF}}$ for both the partial credit models and the GRM.

The Newton-Raphson method is a numerical method to solve nonlinear equations of the form $f(x) = 0$. The approximate solution to the equation is

$$x_{t+1} = x_t - \frac{f(x_t)}{f'(x_t)}, \tag{18}$$

where $x_{t+1}$ is the updated approximation based on the previous estimate, $x_t$, and $f'(x_t)$ is the first derivative of $f$ with respect to $x$, evaluated at $x_t$.

In the following sections, we introduce the approximation of $\mathrm{LI}_{\mathrm{IRF}}$ using the three different polytomous IRT models: the partial credit models (Masters, 1982; Muraki, 1992) and the GRM (Samejima, 1969).

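$\mathrm{LI}_{\mathrm{IRF}}$ of the Partial Credit Models' Items

In the current case, consider a polytomous item with $m + 1$ response categories ranging from 0 to $m$. The probability of obtaining category $x$ on item $i$ under the general form of the partial credit models is given by Equation 4. The expected score given a specific ability value $\theta$ is given by Equation 1. Taking $m/2$ as the critical point and seeking the corresponding $\theta$ value that satisfies this criterion, we have the following:

$$\sum_{x=1}^{m} x P_{ix}(\theta) = \frac{m}{2}. \tag{19}$$

Therefore the function that needs to be solved is

$$f(\theta) = \sum_{x=1}^{m} x P_{ix}(\theta) - \frac{m}{2} = 0. \tag{20}$$

Given that

$$\frac{\partial P_{ix}(\theta)}{\partial \theta} = a_i P_{ix}(\theta) \left[x - \sum_{c=1}^{m} c P_{ic}(\theta)\right], \tag{21}$$

the first derivative of $f(\theta)$ with respect to $\theta$, $f'(\theta)$, is given by

$$f'(\theta) = \sum_{x=1}^{m} x a_i P_{ix}(\theta) \left[x - \sum_{c=1}^{m} c P_{ic}(\theta)\right]. \tag{22}$$

The approximate value of $\mathrm{LI}_{\mathrm{IRF}}$ using the partial credit models is

$$\theta_{t+1} = \theta_t - \frac{f(\theta_t)}{f'(\theta_t)} = \theta_t - \frac{\sum_{x=1}^{m} x P_{ix}(\theta_t) - \frac{m}{2}}{\sum_{x=1}^{m} x a_i P_{ix}(\theta_t) \left[x - \sum_{c=1}^{m} c P_{ic}(\theta_t)\right]}. \tag{23}$$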
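The update in Equation 23 can be sketched in Python as follows. This is a minimal illustration, assuming $D = 1$; the function names, the starting value, the tolerance, and the iteration cap are choices made for the sketch rather than details given in the report, and plain Newton-Raphson steps may overshoot for items with very small slopes or widely scattered thresholds.

```python
import math

def gpcm_probs(theta, a, b):
    """GPCM category probabilities (Equation 4 with D = 1)."""
    z = [0.0]
    for bv in b:
        z.append(z[-1] + a * (theta - bv))
    zmax = max(z)  # shift exponents for numerical stability
    e = [math.exp(v - zmax) for v in z]
    s = sum(e)
    return [v / s for v in e]

def li_irf_gpcm(a, b, theta0=0.0, tol=1e-10, max_iter=100):
    """Newton-Raphson solution of Equation 20 using the update in Equation 23."""
    m = len(b)
    theta = theta0
    for _ in range(max_iter):
        p = gpcm_probs(theta, a, b)
        ex = sum(x * px for x, px in enumerate(p))                      # Equation 1
        f = ex - m / 2.0                                                # Equation 20
        fprime = sum(x * a * px * (x - ex) for x, px in enumerate(p))   # Equation 22
        step = f / fprime
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")

# Item 1 of Table 2: a = 0.60, thresholds -0.49, 0.81, 1.77.
print(round(li_irf_gpcm(0.60, [-0.49, 0.81, 1.77]), 2))
```

For a three-category item the routine returns the midpoint of the two thresholds, in agreement with Equation 7.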

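$\mathrm{LI}_{\mathrm{IRF}}$ of the Graded Response Model's Items

The probability of obtaining category $x$ on item $i$ under the graded response model is given by

$$P_{ix}(\theta) = P^*_{ix}(\theta) - P^*_{i,x+1}(\theta), \tag{24}$$

where $P^*_{ix}(\theta)$ is given by the formula of a two-parameter logistic model, as follows:

$$P^*_{ix}(\theta) = \frac{1}{1 + \exp\left[-a_i(\theta - b_{ix})\right]}. \tag{25}$$

Therefore the function that needs to be solved is

$$f(\theta) = \sum_{x=1}^{m} x P_{ix}(\theta) - \frac{m}{2} = 0. \tag{26}$$

Given that

$$\sum_{x=1}^{m} x P_{ix}(\theta) = \sum_{x=1}^{m} P^*_{ix}(\theta) \tag{27}$$

and

$$\frac{\partial P^*_{ix}(\theta)}{\partial \theta} = a_i P^*_{ix}(\theta) \left[1 - P^*_{ix}(\theta)\right], \tag{28}$$

the first derivative of $f(\theta)$ with respect to $\theta$, $f'(\theta)$, is given by

$$f'(\theta) = \sum_{x=1}^{m} a_i \left[P^*_{ix}(\theta) \left(1 - P^*_{ix}(\theta)\right)\right]. \tag{29}$$

The approximate value of $\mathrm{LI}_{\mathrm{IRF}}$ using the GRM is

$$\theta_{t+1} = \theta_t - \frac{\sum_{x=1}^{m} P^*_{ix}(\theta_t) - \frac{m}{2}}{\sum_{x=1}^{m} a_i \left[P^*_{ix}(\theta_t) \left(1 - P^*_{ix}(\theta_t)\right)\right]}. \tag{30}$$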
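The GRM version of the iteration is shorter because Equation 27 lets the expected score be written as a sum of cumulative probabilities. The sketch below is illustrative only: the function name, starting value, tolerance, and example parameters are assumptions, not values from the report.

```python
import math

def li_irf_grm(a, b, theta0=0.0, tol=1e-10, max_iter=100):
    """Newton-Raphson solution of Equation 26 using the update in Equation 30."""
    m = len(b)
    theta = theta0
    for _ in range(max_iter):
        # Cumulative probabilities P*_ix (Equation 25).
        pstar = [1.0 / (1.0 + math.exp(-a * (theta - bx))) for bx in b]
        f = sum(pstar) - m / 2.0                            # Equations 26 and 27
        fprime = sum(a * ps * (1.0 - ps) for ps in pstar)   # Equation 29
        step = f / fprime
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical item whose thresholds are symmetric about 0.3.
print(round(li_irf_grm(1.2, [-0.7, 0.3, 1.3]), 4))
```

When the thresholds are symmetric about a point, as in this example, the expected score equals $m/2$ exactly at that point, which gives a convenient sanity check on the iteration.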

For a given polytomous item with three response categories, there is a correspondence between the $\mathrm{LI}_{\mathrm{IRF}}$ and the ICRF-based LIs (see Equation 7 and Figures 2 and 3). For items with more than three response categories, the values of these indices are different (see Figures 1 and 4).

An Extended Example

The following empirical example is an illustration of computing the LIs for a polytomous item. Table 2 provides GPCM parameters for five four-category items and three six-category items from the National Assessment of Educational Progress. The corresponding LI for each item is calculated using the formulas presented in the previous sections.

Figure 3 Item response function for a three-category item (horizontal axis: ability; curve: E(X)).

Figure 4 Item response function for a five-category item (horizontal axis: ability; curve: E(X)).

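Table 2 Item Parameters and the Corresponding Location Indices

| Item $i$ | $a_i$ | $b_{i1}$ | $b_{i2}$ | $b_{i3}$ | $b_{i4}$ | $b_{i5}$ | $\mathrm{LI}_{\mathrm{mean}}$ | $\mathrm{LI}_{\mathrm{trimmed\ mean}}$ | $\mathrm{LI}_{\mathrm{median}}$ | $\mathrm{LI}_{\mathrm{IRF}}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.600 | -0.490 | 0.810 | 1.770 | | | 0.700 | 0.810 | 0.810 | 0.720 |
| 2 | 0.680 | -0.300 | 0.900 | 1.830 | | | 0.810 | 0.900 | 0.900 | 0.830 |
| 3 | 0.620 | -0.390 | -0.590 | 0.230 | | | -0.250 | -0.390 | -0.390 | -0.290 |
| 4 | 1.170 | 0.360 | 1.210 | 1.640 | | | 1.070 | 1.210 | 1.210 | 1.100 |
| 5 | 0.640 | -0.170 | -0.200 | 0.610 | | | 0.210 | -0.170 | -0.170 | 0.050 |
| 6 | 0.164 | 13.407 | -7.203 | -1.454 | 2.022 | 4.858 | 2.326 | 1.809 | -2.022 | 1.468 |
| 7 | 0.465 | -4.274 | -0.21 | 0.611 | -0.343 | 0.478 | -1.054 | -0.025 | -0.210 | -0.305 |
| 8 | 0.296 | -0.287 | -1.432 | 0.905 | -2.970 | -1.414 | -0.946 | -1.044 | -1.414 | -1.029 |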

As an example, the parameters for Item 1 are $a_1 = 0.60$, $b_{11} = -0.49$, $b_{12} = 0.81$, and $b_{13} = 1.77$. Therefore $\mathrm{LI}_{\mathrm{mean}} = \sum_{c=1}^{3} b_{1c}/3 = (-0.49 + 0.81 + 1.77)/3 = 0.70$, and $\mathrm{LI}_{\mathrm{trimmed\ mean}} = \mathrm{LI}_{\mathrm{median}} = b_{12} = 0.81$. Regarding the $\mathrm{LI}_{\mathrm{IRF}}$ for Item 1, it is computed using Equation 23. Suppose we choose $\theta_0 = 0$ as an initial estimate for the $\mathrm{LI}_{\mathrm{IRF}}$ so we can update this estimate using Equation 23, where $m = 3$. Hence after two iterations, $\mathrm{LI}_{\mathrm{IRF}} = 0.72$.

Because the four proposed LIs are identical for the case of three-category items, such items are not reported in Table 2. From the table, it is clear that the $\mathrm{LI}_{\mathrm{trimmed\ mean}}$ and $\mathrm{LI}_{\mathrm{median}}$ are equal for the four-category items (e.g., Items 1-5). For items with more than five response options, these two LIs are different (e.g., Items 6-8).
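As a worked illustration for a six-category item, the arithmetic behind two of the ICRF-based indices of Item 6 can be written out from the thresholds in Table 2. The thresholds are rank ordered before trimming, as Proposal 3 prescribes when reversals occur; the rounding to three decimals is an assumption about how the tabled values were reported.

```latex
\begin{align*}
\mathrm{LI}_{\mathrm{mean}}
  &= \frac{13.407 - 7.203 - 1.454 + 2.022 + 4.858}{5}
   = \frac{11.630}{5} = 2.326, \\
\mathrm{LI}_{\mathrm{trimmed\ mean}}
  &= \frac{-1.454 + 2.022 + 4.858}{3}
   = \frac{5.426}{3} \approx 1.809.
\end{align*}
```

The trimmed mean drops the smallest and largest thresholds ($-7.203$ and $13.407$), which is why it is far less sensitive than the mean to the extreme first threshold of this item.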

From the nature of the different proposals for the LI, we can infer some results. One result is that the $\mathrm{LI}_{\mathrm{mean}}$ does not reflect the order of the thresholds, or $b_{ix}$; its value stays the same even though these parameters differ in their rank order (i.e., reversals). Alternating the values attached to these categories will not affect any of the ICRF-based indices. This characteristic holds for both the $\mathrm{LI}_{\mathrm{trimmed\ mean}}$ and $\mathrm{LI}_{\mathrm{median}}$. With regard to the $\mathrm{LI}_{\mathrm{IRF}}$, it has a sound foundation gained from the IRF, and therefore it is more appropriate to represent the difficulty of a polytomous item as a whole.

Discussion

Global indices that reflect the location of a polytomous item with ordered-response options were proposed. The methodology, which depends on a polytomous item’s multiple response curves (i.e., ICRFs) or a single expected curve (i.e., IRF), is also valid for dichotomous items. In other words, the difficulty parameter for a dichotomous item can be obtained using either approach (i.e., studying the two response curves or the correct response curve).

Figure 5. Item category response functions for a two-category item (curves $P_0$ and $P_1$).

Figure 6. Expected score for a two-category item.

To illustrate, when a two-parameter logistic model (Lord & Novick, 1968) is used to model responses to a dichotomous item, the two characteristic curves of the item (i.e., the curves that correspond to correct and incorrect answers) intersect at the difficulty parameter, $b_i$. Figure 5 shows an example of the curves of correct and incorrect responses of a dichotomous item intersecting at the point corresponding to $\theta = 1.0$ (the item difficulty parameter). This figure is based on the two curves representing the ICRFs, the first approach used in the polytomous case. The three forms of an LI (i.e., $\mathrm{LI}_{\text{mean}}$, $\mathrm{LI}_{\text{trimmed mean}}$, and $\mathrm{LI}_{\text{median}}$) reduce to one form for dichotomous items because they have only two response options (correct-incorrect or perfect-zero scores).
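The intersection property can be checked directly. The derivation below is a short sketch that assumes the two-parameter logistic model is written without a scaling constant; including a positive constant multiplying $a$ would not change the result.

```latex
\begin{align*}
P_1(\theta) &= \frac{\exp[a(\theta - b)]}{1 + \exp[a(\theta - b)]}, \\
P_0(\theta) &= 1 - P_1(\theta) = \frac{1}{1 + \exp[a(\theta - b)]}, \\
P_0(\theta) = P_1(\theta)
  &\iff \exp[a(\theta - b)] = 1
  \iff \theta = b \qquad (a > 0).
\end{align*}
```

At the crossing point both probabilities equal .50, so for the item in Figure 5 the curves meet at $\theta = b = 1.0$.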

From the point of view of the expected item score, or IRF, the point on the ability scale corresponding to an expected score of .50 for dichotomous scoring represents an index of item difficulty. Figure 6 shows the correct-response curve of the same item, which has a difficulty parameter of $b_i = 1.0$. This curve also represents the expected score conditional on ability level (i.e., the IRF), and the ability level corresponding to an expected value of .50 is $\theta = 1.0$. This provides the basis for the second approach.
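The expected-score criterion can also be sketched numerically. The Python below is a minimal sketch and not the report's procedure: it assumes a GPCM-style parameterization with no scaling constant, takes the polytomous target to be half the maximum score ($m/2$, a generalization of the .50 criterion that is an assumption here), and swaps in plain bisection for the iterative update of Equation 23, which is not reproduced in this section. All function names are hypothetical.

```python
import math


def gpcm_probs(theta, a, thresholds):
    """Category probabilities under an assumed GPCM form (no scaling constant)."""
    # Exponent for category k is the running sum of a * (theta - b_x), x <= k;
    # category 0 has exponent 0.
    exponents = [0.0]
    for b in thresholds:
        exponents.append(exponents[-1] + a * (theta - b))
    top = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, a, thresholds):
    """Expected item score (the IRF) at ability theta."""
    return sum(k * p for k, p in enumerate(gpcm_probs(theta, a, thresholds)))


def li_irf(a, thresholds, lo=-20.0, hi=20.0, tol=1e-8):
    """Bisection for the theta at which the expected score equals m / 2.

    Assumes a > 0, so the expected score increases with theta, and that the
    solution lies inside [lo, hi].
    """
    target = len(thresholds) / 2.0  # m = number of thresholds = maximum score
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, a, thresholds) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For a dichotomous item with difficulty 1.0, `li_irf(1.2, [1.0])` returns a value within the bisection tolerance of 1.0, matching Figure 6. Under this assumed model, the solution for a three-category item is the mean of its two thresholds, which agrees with the statement that all LIs coincide for three-category items.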


LIs of polytomous items have several potential uses. One potential use is in test assembly. An index representing an item’s location along the latent variable replaces the multiple category parameters. With such an index, the characteristics of a polytomous item can also be represented by two main item parameters (i.e., $a$ and the LI), in addition to the category-related parameters. The number of categories is also important; the area under the item information function summarizes the amount of information an item can give. For any GPCM item, for instance, this area is $ma$ (Huynh, 1994, 1996). Hence, when tests are created using polytomous items, these two parameters per item would be useful for selecting items to satisfy target test specifications. Many testing programs use mixed-format tests in which both selected-response items and constructed-response items are included. One main target in the specifications of a test is the mean difficulty. In many cases, test developers consider only the section of selected-response items and ignore the section of constructed-response items for various reasons, such as their small number relative to the test length. In such cases, the LIs of these ignored items can help test developers achieve the target test difficulty. Test assembly and construction of parallel forms can be easily done using the results of the current report in conjunction with Chang and Mazzeo’s (1994) results. We have been able to summarize the IRF of a polytomous item in a single value, and Chang and Mazzeo proved that any two items following any of the studied models can be treated as equivalent, provided that their IRFs coincide.
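As a rough illustration of the test-assembly use, the following Python sketch picks the constructed-response item whose LI brings the mean location of a form closest to a target difficulty. It is only a sketch under stated assumptions: the selected-response difficulties and the target are hypothetical, the constructed-response values are the last-column entries for Items 3-5 of Table 2 (read here as $\mathrm{LI}_{\text{IRF}}$), and the function name `best_addition` is invented for this example.

```python
from statistics import mean


def best_addition(current_locations, candidates, target):
    """Return the candidate id whose inclusion brings the form mean closest to target.

    current_locations: list of locations already on the form (b for
        selected-response items, LI for polytomous items).
    candidates: dict mapping candidate item id -> location index.
    """
    return min(
        candidates,
        key=lambda cid: abs(mean(current_locations + [candidates[cid]]) - target),
    )


# Hypothetical selected-response difficulties already on the form.
sr_b = [-0.80, -0.20, 0.30, 0.90]
# Constructed-response candidates with their location indices (Table 2, last column).
cr_li = {"Item 3": -0.290, "Item 4": 1.100, "Item 5": 0.050}

print(best_addition(sr_b, cr_li, target=0.25))
```

A single location value per polytomous item is what makes this comparison possible; with only category thresholds available, there would be no obvious quantity to average alongside the dichotomous difficulties.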

A second potential area for using the LI is computerized adaptive testing (CAT). In the literature on testing with polytomous items, in particular adaptive testing, some item selection methods are natural extensions of those used with dichotomous items, including information indices. The information-based item selection approach may consider the item as a whole or at the score category level. Dodd et al. (1995) commented that only the information-based item selection algorithms have been investigated for the CRM and PCM because there is no single index or summary of the multiple location (or scale value) parameters in these models. In the context of adaptive testing, item selection based on matching difficulty can be implemented by matching an individual’s estimated ability level to a polytomous item’s LI. In particular, four proposed item selection methods in polytomous adaptive testing are built on the alternative forms of the polytomous-item LI. The next item to be administered is the one whose value on the chosen form of the index matches the current ability estimate. For example, considering the $\mathrm{LI}_{\text{IRF}}$ computed for an item under a polytomous response model, the next item for administration is chosen by matching $\mathrm{LI}_{\text{IRF}}$ to the current estimate of an examinee’s ability.
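The matching rule described above can be sketched as a nearest-location search. The Python below is a minimal illustration rather than the authors' implementation: the function name `select_next_item` is hypothetical, "matching" is assumed to mean the smallest absolute distance between the LI and the ability estimate, and the LI values are the last-column entries for Items 3-8 of Table 2 (read here as $\mathrm{LI}_{\text{IRF}}$).

```python
def select_next_item(theta_hat, item_li, administered=()):
    """Pick the unadministered item whose LI is closest to the ability estimate.

    theta_hat: current ability estimate.
    item_li: dict mapping item id -> location index (any one LI form).
    administered: ids of items that have already been given.
    """
    remaining = {i: li for i, li in item_li.items() if i not in administered}
    if not remaining:
        raise ValueError("no items left to administer")
    return min(remaining, key=lambda i: abs(remaining[i] - theta_hat))


# Location indices for Items 3-8 (Table 2, last column).
li_irf_pool = {3: -0.290, 4: 1.100, 5: 0.050, 6: 1.468, 7: -0.305, 8: -1.029}

first = select_next_item(0.0, li_irf_pool)
second = select_next_item(0.0, li_irf_pool, administered={first})
print(first, second)
```

Swapping in a different form of the index only requires passing a different `item_li` dictionary; the selection rule itself stays the same.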

Lima Passos, Berger, and Tan (2008) presented some findings regarding the quantity $(b_{i1} + b_{i2})/2$ for an item. This index, based on results reported in this report, corresponds to the mean of item category thresholds, $\mathrm{LI}_{\text{mean}}$, where all LIs, such as $\mathrm{LI}_{\text{IRF}}$ in the case of three-category items, are equal, as shown in Equation 7. Lima Passos et al. found that the smaller the difference given by $[(b_{i1} + b_{i2})/2] - \theta$, the better (i.e., the more accurate) the tailoring between a selected item $i$ and the underlying trait $\theta$ is. This is the core idea of the matching LI procedure in polytomous adaptive testing and one of the main applications of polytomous item LIs. Other applications are possible, such as extending the idea of a single location index to item sets or testlets by proposing an overall testlet LI. How well these indices work for CAT and for test assembly remains a question for further study.

References

Akkermans, W., & Muraki, E. (1997). Item information and discrimination functions for trinary PCM items. Psychometrika, 62, 569-578.

Ali, U. S. (2011). Item selection methods in polytomous computerized adaptive testing (Unpublished doctoral dissertation). University of Illinois, Urbana, IL.

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.

Andrich, D. (1982). An extension of the Rasch model for ratings providing both location and dispersion parameters. Psychometrika, 47, 105-113.

Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397-479). Reading, MA: Addison-Wesley.

Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29-51.

Chang, H.-H., & Mazzeo, J. (1994). The unique correspondence of the item response function and item category response functions in polytomously scored item response models. Psychometrika, 59, 391-404.

Dodd, B. G., de Ayala, R. J., & Koch, W. R. (1995). Computerized adaptive testing with polytomous items. Applied Psychological Measurement, 19, 5-22.


Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.

Huynh, H. (1994). On equivalence between a partial credit item and a set of independent Rasch binary items. Psychometrika, 59, 111-119.

Huynh, H. (1996). Decomposition of a Rasch partial credit item into independent binary and indecomposable trinary items. Psychometrika, 61, 31-39.

Lima Passos, V., Berger, M. P. F., & Tan, F. E. (2008). The D-optimality item selection criterion in the early stage of CAT: A study with the graded response model. Journal of Educational and Behavioral Statistics, 33, 88-110.

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.

Mellenbergh, G. J. (1995). Conceptual notes on models for discrete polytomous item responses. Applied Psychological Measurement, 19, 91-100.

Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176.

Muraki, E. (1993). Information functions of the generalized partial credit model. Applied Psychological Measurement, 17, 351-363.

Muraki, E., & Bock, R. D. (2003). PARSCALE 4 for Windows: IRT based test scoring and item analysis for graded open-ended exercises and performance tasks [Computer software]. Lincolnwood, IL: Scientific Software International.

Nering, M. L., & Ostini, R. (2006). Polytomous item response theory models. Thousand Oaks, CA: Sage.

Nering, M. L., & Ostini, R. (2010). New perspectives and applications. In M. L. Nering & R. Ostini (Eds.), Handbook of polytomous item response theory models (pp. 3-20). New York, NY: Routledge.

Rost, J. (1988). Measuring attitudes with a threshold model drawing on a traditional scaling concept. Applied Psychological Measurement, 12, 397-409.

Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores (Psychometrika Monograph No. 17). Richmond, VA: Psychometric Society.

Thissen, D., & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567-577.

Verhelst, N. D., & Verstralen, H. H. F. M. (2008). Some considerations on the partial credit model. Psicológica, 29, 229-254.

Suggested citation:

Ali, U. S., Chang, H.-H., & Anderson, C. J. (2015). Location indices for ordinal polytomous items based on item response theory (Research Report No. RR-15-20). Princeton, NJ: Educational Testing Service. http://dx.doi.org/10.1002/ets2.12065

Action Editor: Matthias von Davier

Reviewers: Shelby Haberman, John R. Donoghue, and Peter van Rijn

ETS and the ETS logo are registered trademarks of Educational Testing Service (ETS). All other trademarks are property of their respective owners.

Find other ETS-published reports by searching the ETS ReSEARCHER database at http://search.ets.org/researcher/
