International Journal of Evaluation and Research in Education (IJERE)
Vol. 5, No. 4, December 2016, pp. 261-270
ISSN: 2252-8822
Evaluation of Northwest University, Kano Post-UTME Test
Items Using Item Response Theory
Ado Abdu Bichi, Hadiza Hafiz, Samira Abdullahi Bello
Department of Arts and Social Sciences Education, Northwest University, Kano-Nigeria
Article Info
Article history:
Received Aug 21, 2016
Revised Oct 21, 2016
Accepted Nov 27, 2016
Keywords:
Economics
Item analysis
Item response theory
Nigerian universities
Post-UTME
ABSTRACT
High-stakes testing is used to provide results that have important consequences, and validity is the cornerstone upon which all measurement systems are built. This study applied Item Response Theory (IRT) principles to analyse the Northwest University Kano Post-UTME Economics test items. The fifty (50) economics test items developed were administered to a sample of 600 students. The data obtained were analysed using the XCALIBRE 4 and SPSS 20 software packages to determine item parameters based on IRT models. The results indicate that the test measures a single trait, satisfying the condition of unidimensionality. Similarly, the goodness-of-fit test revealed that the two-parameter IRT model was the most suitable, since no misfit item was observed, and the test reliability was 0.86. The mean examinee ability was 0.07 (SD = 0.94), the mean item difficulty was -0.63 (SD = 2.54), and the mean item discrimination was 0.28 (SD = 0.04). Sixteen (33%) items were identified as "problematic" on the basis of their difficulty indices, and 35 (71%) also failed to meet the set standards on the basis of their discrimination parameters. It can be concluded that, using the IRT approach, the NWU Post-UTME items are not stable as far as item difficulty and discrimination indices are concerned. It is recommended that the Post-UTME items pass through all processes of standardisation and validation, and that test development and content experts be involved in developing and validating the test items in order to obtain valid and reliable results which will lead to valid inferences.
Copyright © 2016 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
Ado Abdu Bichi,
Faculty of Education,
Northwest University, City Campus, Kofar Nassarawa, PMB 3220, Kano, Nigeria.
Email: adospecial@gmail.com
1. INTRODUCTION
Assessment of students' learning is an indispensable part of the educational process. The major aim of assessment is to measure students' achievement in order to make a variety of decisions based on learners' performance, for example, to know the present level of students' learning and the extent to which they are ready for the next learning experiences [1]. The nature and quality of information gathered from an achievement test can guide educational development efforts and direct instruction [2]. Tests are used for a number of purposes, which include improving instructional planning, motivating learners to improve their performance, licensing, certification, and selection. In addition, tests are used in certifying students as having attained specific levels of achievement [1],[3]. Information from achievement tests helps to establish the extent to which students progress beyond the minimum basics, the extent to which students achieve the learning goals of the course, and whether the learners are ready for the next learning experience [4].
The university is regarded as the single most important industry for the production of high-level manpower in Nigeria. In support of this principle, the stakeholders in the Nigerian university education sector tend to guard jealously the integrity of the university and the quality of graduates produced [5]. However, in recent years the integrity attached to Nigerian universities seems to have faded away. This is evident in the
way and manner in which stakeholders in the university education sector maintain constant criticism of the admission procedures and the quality of graduates produced by Nigerian universities today.
Until recently, admission into Nigerian universities was exclusively through the Unified Tertiary Matriculation Examination (UTME) conducted by the Joint Admissions and Matriculation Board (JAMB). However, owing to the universities' loss of confidence in the UTME, resulting from the lack of correlation between candidates' UTME scores and their performance in university examinations, the universities introduced the Post-Unified Tertiary Matriculation Examination (Post-UTME).
The controversies surrounding the introduction and sustenance of the Post-UTME by Nigerian universities, though well intentioned, give much concern to stakeholders, as the examination makes the process of securing admission into universities cumbersome and expensive. Yet the need to ensure that competent candidates are admitted into universities cannot be compromised, as was witnessed prior to the introduction of the Post-UTME.
1.1. Post-Unified Tertiary Matriculation Examination (Post-UTME)
The obvious weaknesses observed in the process of admitting entrants into Nigerian universities through the UTME and the incessant decline in the expected roles of universities in Nigeria, coupled with several calls for an alternative method of admission into the nation's universities, prompted reform. For example, [6] noted that for many years Nigerian universities were able to admit only 7 to 10% of applicants. This low number of students being offered admission annually made both applicants and their parents desperate in their bid to be among the few offered admission to these universities, which ushered in different types of examination malpractice in the conduct of the UTME.
These and many other concerns raised by stakeholders resulted in the federal government of Nigeria, under the then leadership of President Olusegun Aremu Obasanjo, resolving in 2005 to grant power to universities, through the then Minister of Education, Mrs. Chinwe Abaji, to conduct screening tests (Post-UTME) for admission into their various undergraduate programmes. Under this policy it became mandatory for all universities in the country to organise a screening test for prospective candidates after they pass the UTME and before offering them a place on their programmes. Candidates who score the required cut-off marks in the UTME are shortlisted by JAMB and sent to the universities of their choice; the universities then screen the candidates using oral interviews, an aptitude test, or even another examination [7].
Although the Post-UTME was introduced into Nigerian universities to help ameliorate many problems in the system, several problems were identified by stakeholders in the country's university education. Several studies have confirmed that the contents of the UTME and Post-UTME items are not the same and that the Post-UTME items are more difficult than those of the UTME. Similarly, studies have confirmed the failure of the Post-UTME to predict the future academic success of undergraduate students of Nigerian universities [7]-[12].
Moreover, little attention has been paid by researchers in universities to in-depth analysis of the items contained in the Post-UTME. It is therefore not surprising that some of the items being used to take decisions on students are not good. A comprehensive study of the process by which the Post-UTME items were constructed, as well as of their psychometric characteristics, may suggest ways of improving students' performance in the Post-UTME.
1.2. Purpose of the Study
The main purpose of this study is to evaluate the quality of the Nigerian universities' Post-UTME economics test items. It is organised to meet the following specific research objectives:
i. To examine the fit of the Northwest University (NWU) Post-UTME Economics test items to IRT models;
ii. To test for violations of the essential IRT assumptions of unidimensionality and local independence;
iii. To identify the distribution of item discrimination, difficulty, and pseudo-guessing parameters for the NWU Post-UTME Economics test items.
1.3. Research Questions
To attain the set objectives of this study, the following research questions will be addressed:
i. Do the NWU Post-UTME Economics test items fit the IRT models?
ii. Do the NWU Post-UTME Economics test items satisfy the essential IRT assumptions of unidimensionality and local item independence?
iii. What are the item parameters (discrimination, difficulty, and pseudo-guessing) of the NWU Post-UTME Economics test items?
2. ITEM RESPONSE THEORY
According to [13], Item Response Theory (IRT) is a set of latent variable techniques especially designed to model the interaction between a subject's "ability" and item-level stimuli (difficulty, guessing, etc.). The focus is on the pattern of responses rather than on composite or total score variables and linear regression theory. The IRT framework emphasizes how responses can be thought of in probabilistic terms. In IRT the item responses are considered the outcome (dependent) variables, and the examinee's ability and the items' characteristics are the latent predictor (independent) variables [14].
The characteristics of item response models, as summarised by [15], are as follows: first, an IRT model must specify the relationship between the observed response and the underlying unobservable construct; secondly, the model must provide a way to estimate scores on the ability; third, the examinee's scores are the basis for estimation of the underlying construct; finally, an IRT model assumes that the performance of an examinee can be entirely predicted or explained by one or more abilities.
2.1. Assumptions of Item Response Theory
In using an IRT model, it is very important to assess the extent to which the model's assumptions hold for the given data [16]. The most significant assumption common to all IRT models is unidimensionality; the other key assumption is local independence. Test data can be valid for latent trait model estimation only if these assumptions are met.
2.1.1. Unidimensionality
Unidimensionality states that the items in a test measure a single ability or trait and that the items form a unidimensional scale of measurement [17]. Item response models that assume a single latent ability are referred to as unidimensional. This assumption means that the items measure only one area of ability or knowledge, and it is empirically assessed by investigating whether a dominant factor exists among all the test items [18].
2.1.2. Item Local Independence
The assumption of local independence means that the probability of an examinee answering an item correctly is not affected by the answers given to other items in the test. For example, if the responses to one item structurally constrain the possible answers to other items, then the items are not locally independent. If these assumptions are met, an IRT model can be successfully employed [19].
2.2. Item Response Theory Models
[20] summarised the models thus: "IRT models differ depending on whether the relationship between item performance and knowledge is considered a one-, two- or three-parameter logistic function."
2.2.1. The one-parameter logistic model
The 1-parameter model explains the relationship between ability and the probability of a correct response on an item in terms of the item difficulty alone. An item's difficulty parameter (b) is the point on the ability scale corresponding to the location on the item characteristic curve (ICC) where the probability of a correct response is 0.5 [18].

P(θ) = 1 / (1 + e^(-1.7(θ - b)))    (1)

where:
P(θ) = probability of a correct response for an examinee with ability θ
b = difficulty level of the item
e ≈ 2.718 = base of the natural logarithm
a = 1 (the discrimination parameter is fixed in this model).
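As a minimal illustration (ours, not the paper's), Equation (1) translates directly into Python; the function name and example values are hypothetical:

```python
import math

def p_1pl(theta: float, b: float) -> float:
    """Probability of a correct response under the 1PL model (Equation 1),
    with scaling constant 1.7 and discrimination fixed at 1."""
    return 1.0 / (1.0 + math.exp(-1.7 * (theta - b)))

# When ability equals the item difficulty, the probability is 0.5,
# matching the definition of b given above.
print(p_1pl(theta=0.0, b=0.0))   # 0.5
print(p_1pl(theta=1.5, b=0.0))   # ~0.93, ability well above the difficulty
```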
2.2.2. The two-parameter logistic model
The 2-parameter model makes use of the b parameter (item difficulty) just as in the 1PLM, and in addition adds a parameter that indicates how well an item separates students into different ability levels; this parameter is called item discrimination (a). The item discrimination parameter used in the 2PLM is equal to the slope of the item characteristic curve at its steepest point [18].

P(θ) = 1 / (1 + e^(-1.7a(θ - b)))    (2)
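A corresponding sketch for Equation (2), again illustrative rather than the paper's implementation, shows how the discrimination parameter a controls the steepness of the curve:

```python
import math

def p_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL model (Equation 2)."""
    return 1.0 / (1.0 + math.exp(-1.7 * a * (theta - b)))

# A larger a separates examinees near theta = b more sharply.
for a in (0.3, 1.0, 2.0):
    print(a, round(p_2pl(theta=0.5, a=a, b=0.0), 3))   # 0.563, 0.701, 0.845
```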
Evaluation of Northwest University, Kano Post UTME Test Items Using Item .... (Ado Abdu Bichi)
264 □
ISSN:2252-8822
2.2.3. The three-parameter logistic model
The 3PL model builds upon the two-parameter model by adding a pseudo-chance-level parameter, c. The c parameter is the value of the lower asymptote of the item characteristic curve and indicates the probability that an examinee with a very low ability would answer the item correctly.

P(θ) = c + (1 - c) / (1 + e^(-1.7a(θ - b)))    (3)
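Equation (3) can be sketched the same way; the value 0.25 below is only a plausible guessing floor for a four-option multiple-choice item, not a figure from the paper:

```python
import math

def p_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the 3PL model (Equation 3)."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

# Even a very low-ability examinee retains the guessing floor c.
print(round(p_3pl(theta=-3.0, a=1.0, b=0.0, c=0.25), 3))   # ~0.255
```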
2.3. IRT-Item Characteristic Curves (ICC)
An item characteristic curve plots the probability that an examinee will respond correctly to an item solely as a function of the test's latent trait [21]. The values on the X-axis of an ICC represent the latent trait, usually ranging from -3 to +3; the Y-axis represents the probability of an examinee's success. As the latent trait increases, the probability of the examinee responding correctly increases, but with diminishing returns. The probability of a correct response is determined by the item's difficulty and the examinee's ability, as illustrated by the item characteristic curve (ICC) in Figure 1.
Figure 1. Item Characteristic Curve [22]
From this ICC we can observe that as the examinee's ability increases, the probability of a correct response increases, which is what one would expect in practice. The earlier discussion and equations suggest that the probability of a correct response is 0.5 for any examinee whose ability is equal to the value of the item difficulty.
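The shape described above can be visualised with a short script; this is our sketch of generic 2PL ICCs, not a reproduction of Figure 1:

```python
import numpy as np
import matplotlib.pyplot as plt

theta = np.linspace(-3, 3, 200)   # latent trait range noted in the text

def icc(theta, a, b):
    """2PL item characteristic curve with scaling constant 1.7."""
    return 1.0 / (1.0 + np.exp(-1.7 * a * (theta - b)))

# An easy item (b = -1) and a harder one (b = +1), equal discrimination.
for b in (-1.0, 1.0):
    plt.plot(theta, icc(theta, a=1.0, b=b), label=f"b = {b:+.0f}")
plt.axhline(0.5, linestyle="--", color="grey")   # P = 0.5 where theta = b
plt.xlabel("Ability (theta)")
plt.ylabel("Probability of correct response")
plt.legend()
plt.show()
```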
2.4. Item Response Theory Item Analysis Statistics
Test item analysis involves statistics that help in analysing the effectiveness of items and in improving them. These statistics provide useful information for determining the validity and accuracy of an item in describing learners' or examinees' ability from their responses to each item in a test. The common IRT item analysis statistics are item discrimination (a-values), item difficulty (b-values), pseudo-guessing (c-values), and reliability. This paper covers the two major statistics of item difficulty (b-values) and discrimination (a-values). The values of the IRT item parameters are interpreted based on the XCALIBRE recommended acceptable ranges of a > 0.30 and -3.0 < b < 3.0 [23].
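The acceptable ranges quoted from [23] amount to a simple screening rule; a hypothetical helper (ours) makes the rule explicit:

```python
def flag_item(a: float, b: float) -> list:
    """Flag an item against the acceptable ranges a > 0.30 and -3.0 < b < 3.0.
    The paper's item counts treat a = 0.30 as acceptable, so we flag a < 0.30."""
    flags = []
    if a < 0.30:
        flags.append("poor discrimination")
    if not (-3.0 < b < 3.0):
        flags.append("difficulty out of range")
    return flags

# Item 1 from Table 3 (b = -4.00, a = 0.42): difficulty is out of range.
print(flag_item(a=0.42, b=-4.00))   # ['difficulty out of range']
```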
3. RESEARCH METHOD
3.1. Design of The Study
This study is quantitative, and a survey design was adopted to collect the relevant data. The IRT models guide the study; IRT is used in order to overcome the limitations of Classical Test Theory (CTT) [24].
3.2. Participants
The population of this study comprises all senior secondary school three (SS III) students in Nigeria. A total of 550 SS III students, aged 16-18 and selected using a stratified random sampling technique, participated in the study. SS III students were selected because, through SS I, SS II and SS III, they have been taught all the relevant topics in Economics and are being prepared to write their final examination in the same subject.
3.3. Instrument for Data Collection
The Northwest University, Kano 2014/2015 and 2015/2016 Post-UTME (NWU Post-UTME) questions, which were designed and constructed to assess the suitability of prospective students for admission to 100-level undergraduate programmes of Northwest University, Kano, were used. The NWU Post-UTME uses a multiple-choice item format with four answer options (A-D). The NWU Post-UTME items were used for the 2014/2015 and 2015/2016 admission exercises respectively and may well be replicated in coming admission exercises.
3.4. Data Collection Procedure
The 50 NWU Post-UTME multiple-choice items were administered to the sample, after specific instructions for the test had been given, by the researchers with the help of research assistants and the teachers in the sampled schools. The test items were dichotomously scored, and the students' responses and scores from the test were used for the analysis.
3.5. Method of Data Analysis
Data collected were scored dichotomously (right = 1, wrong = 0), and the data file was prepared using Microsoft Excel 2010. IRT item analysis and detection of poor items were then carried out, and the two psychometric properties of the items were determined. Two specialised software packages (XCALIBRE 4.2 and SPSS 20) were used to analyse the Economics test items in this study. SPSS 20 was used to assess the most important assumption common to all IRT models (i.e., unidimensionality): principal component factor analysis was carried out, and the eigenvalues were checked. To estimate students' abilities, item difficulty and discrimination for the test, as well as the goodness of fit of the items according to IRT, the XCALIBRE 4.2 software was used.
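The eigenvalue check described here can be reproduced outside SPSS; the following is our own sketch using a Pearson inter-item correlation matrix, with placeholder data standing in for the real responses:

```python
import numpy as np

def item_eigenvalues(scores: np.ndarray) -> np.ndarray:
    """Eigenvalues of the inter-item correlation matrix of a 0/1 score
    matrix (rows = examinees, columns = items), largest first."""
    corr = np.corrcoef(scores, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

# Unidimensionality is supported when the first eigenvalue clearly
# dominates the rest; inspect the values or their scree plot.
rng = np.random.default_rng(0)
demo = (rng.random((600, 49)) < 0.6).astype(int)   # placeholder responses
print(item_eigenvalues(demo)[:5])
```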
4. PRESENTATION OF THE RESEARCH FINDINGS
The results of this study, as explained in the method of data analysis above, are presented in the form of an IRT analysis, with all results presented under each research question. Table 1 presents the Post-UTME Economics test items' summary statistics. The total number of items in the test is forty-nine (49), and the number of students who sat the test is 600, as presented in the second column of the table. The overall internal consistency reliability coefficient of the test, as measured by Cronbach's Alpha, is 0.86; this shows that the test is reliable, since the coefficient is high, greater than the acceptable value of 0.70 recommended by [25]. The mean test score is 28.82 with a standard deviation of 3.59. The mean item difficulty (b-values) is -0.63, the mean item discrimination (a-values) is 0.28, and the mean student ability (θ) is 0.074, as presented.
Table 1. Summary Statistics for all Calibrated Items

Items (n)  Examinees (n)  Reliability (Alpha)  Mean Score    Mean b        Mean a       Mean θ
49         600            0.86                 28.82 (3.59)  -0.63 (2.54)  0.28 (0.04)  0.074 (0.94)

(Standard deviations in parentheses.)
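The reported alpha of 0.86 follows from the standard formula, which can be computed directly from the scored response matrix; this sketch (ours, not the paper's SPSS routine) shows the calculation:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a dichotomously scored matrix
    (rows = examinees, columns = items)."""
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of total scores
    return k / (k - 1) * (1.0 - item_var / total_var)
```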
Research Question 1: Do the NWU Post-UTME Economics test items fit the 1-parameter logistic (1PL)/Rasch, 2-parameter logistic (2PL), and 3-parameter logistic (3PL) IRT models? The results for item fit in the XCALIBRE output for the 1PL, 2PL and 3PL analyses of the test data are given in the form of a graph displaying the fit between the IRF and the observed proportions of correct responses across examinees' ability levels, together with the p-values associated with two popular statistical tests for identifying item fit, the Chi-square test and the standardized residual (z), where a probability of less than 0.05 (p < .05) signals item misfit. As maintained by [26], given the sensitivity of the Chi-square test to sample size, its p-values may be ignored in favour of the p-values associated with the standardized residual, z. Table 2 gives the number of items identified as misfitting each IRT model at the alpha = .05 level.
Evaluation of Northwest University, Kano Post UTME Test Items Using Item .... (Ado Abdu Bichi)
266 □
ISSN:2252-8822
Table 2. Summary of items fitting each model

IRT Model                           1PL       2PL    3PL
Number of items fitting the model   47        49     45
Non-fitting items                   30 & 45   none   11, 29, 39 & 44
% of items fitting the model        95.9%     100%   91.8%
From the summary of the standardised residual (z Resid) fit test above, two items (30 and 45) misfitted the 1PL model and four items (11, 29, 39 and 44) misfitted the 3PL model; that is, 4.1% and 8.2% of the total items in the test were statistically significant misfits under the 1PL and 3PL models respectively. However, none of the 49 items was statistically significant, and all fitted the 2PL model, at the 0.05 level of significance. The 2PL model, into which all 49 test items fitted, was thus the most compatible with the test data and was therefore used to estimate the item statistics in this study.
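XCALIBRE reports its own standardized residual statistic; the paper does not give the formula, but the general idea of such a fit check can be sketched as follows, with the binning scheme and names being our assumptions:

```python
import numpy as np

def z_residuals(theta, responses, a, b, n_groups=10):
    """Standardized residual fit check for one item under the 2PL model:
    bin examinees by ability, then compare the observed proportion correct
    in each bin with the model-predicted proportion."""
    theta = np.asarray(theta, float)
    responses = np.asarray(responses, int)
    edges = np.quantile(theta, np.linspace(0, 1, n_groups + 1))
    zs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (theta >= lo) & (theta <= hi)
        n = in_bin.sum()
        if n == 0:
            continue
        p_obs = responses[in_bin].mean()
        p_exp = 1.0 / (1.0 + np.exp(-1.7 * a * (theta[in_bin].mean() - b)))
        zs.append((p_obs - p_exp) / np.sqrt(p_exp * (1.0 - p_exp) / n))
    return np.array(zs)

# Consistently large |z| values across bins suggest the item misfits.
```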
Research Question 2: Do the NWU Post-UTME Economics test items satisfy the essential IRT assumptions of unidimensionality and local item independence?
The factor analysis produced eighteen (18) factors with eigenvalues greater than one, together explaining 80.22% of the variance. The first eigenvalue (5.84) was higher than the next eigenvalues (3.72, 3.10, etc.). The first factor explained 11.67% of the variance, the second factor explained 7.44%, and the remaining variance was explained by the other 31 factors. Hence there is one dominating factor in the factor structure of the item set; since this dominating factor explains 11.67% of the variance, the assumption of unidimensionality is established. The eigenvalue analysis also produced a scree plot from which the dimensionality could be inferred: looking at Figure 2, the eigenvalue of the first factor is large compared to that of the second factor, and the eigenvalues of the remaining factors are all about the same.
Figure 2. Scree plot for eigenvalues of Post-UTME EAT
Research Question 3: What are the item parameters (discrimination, difficulty, and pseudo-guessing) of the NWU Post-UTME Economics test items? The item parameters of the Post-UTME Economics Achievement Test generated using the IRT framework, i.e., item difficulty (threshold, b) and item discrimination (slope, a), are presented in Table 3.
Figure 3 displays the Test Information Function for all 49 calibrated items. It presents the amount of information the test provides at each level of ability, or theta (θ). The maximum information provided by the test was 2.022, at an ability (θ) level of -1.600.
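For the 2PL model the Test Information Function is the sum of the item information functions; a sketch of the computation (ours; `a_vals` and `b_vals` would hold the Table 3 estimates) is:

```python
import numpy as np

def test_information(theta, a_vals, b_vals, D=1.7):
    """2PL test information: the sum over items of (D*a)^2 * P * (1 - P),
    evaluated at each ability level theta."""
    theta = np.asarray(theta, dtype=float)
    info = np.zeros_like(theta)
    for a, b in zip(a_vals, b_vals):
        p = 1.0 / (1.0 + np.exp(-D * a * (theta - b)))
        info += (D * a) ** 2 * p * (1.0 - p)
    return info

# Evaluating over a grid such as np.linspace(-3, 3, 121) and taking the
# argmax locates the most informative theta (the paper reports a peak
# of 2.022 at theta = -1.600).
```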
Table 3. Item Parameters for Dichotomously Scored Post-UTME EAT Items

Item  Difficulty (b)  Discrimination (a)    Item  Difficulty (b)  Discrimination (a)
1     -4.00           0.42                  26    -4.00           0.32
2      1.09           0.29                  27    -2.33           0.27
3     -2.12           0.34                  28    -4.00           0.30
4     -1.63           0.28                  29    -1.99           0.25
5      2.49           0.24                  30     4.00           0.25
6      4.00           0.28                  31    -0.85           0.27
7      1.85           0.27                  32     1.67           0.26
8      1.48           0.30                  33    -4.00           0.32
9     -3.19           0.28                  34     3.08           0.23
10     2.53           0.27                  35    -4.00           0.31
11    -0.51           0.25                  36    -2.45           0.28
12    -1.73           0.26                  37     0.91           0.24
13    -4.00           0.31                  38    -0.28           0.29
14     0.89           0.25                  39    -2.78           0.27
15    -2.67           0.28                  40    -1.63           0.27
16    -1.33           0.25                  41     1.71           0.21
17     1.37           0.23                  42    -3.83           0.29
18     4.00           0.30                  43     0.61           0.28
19     ***             ***                  44     2.87           0.32
20    -4.00           0.28                  45     1.44           0.31
21    -3.22           0.31                  46    -0.51           0.25
22    -3.25           0.25                  47    -4.00           0.31
23    -1.64           0.23                  48    -0.82           0.28
24     2.18           0.28                  49     1.34           0.24
25     1.29           0.30                  50    -0.85           0.27

* Items of special interest are in bold in the original.
*** Item 19 was automatically removed by XCALIBRE because it has no variance.
Figure 3. Test Information Function
5. DISCUSSION OF FINDINGS
The focus of this study is to evaluate the quality of the NWU Post-UTME economics test items
using Item Response Theory (IRT) modeling.
5.1. Research Question 1
The first question relates to verification of the IRT model assumptions. The factor analysis produced 18 factors with eigenvalues greater than one, which together explain 80.22% of the variance. The first eigenvalue (5.84) was higher than the next eigenvalues; the first factor explained 11.67% of the variance, the second factor explained 7.44%, and the rest of the variance was explained by the other 31 factors. Hence there is one dominating factor in the factor structure of the item set, and since this dominating factor explains 11.67% of the variance, the assumption of unidimensionality is established. The eigenvalue analysis also produced a scree plot from which the dimensionality could be inferred. By this, the test measures a unidimensional construct.
5.2. Research Question 2
To test model-data fit, the standardised residual (z Resid) fit test was used. None of the 49 items was statistically significant, and all fitted the 2PL model, at the 0.05 level of significance. The 2PL model, into which all 49 test items fitted, was thus the most compatible with the test data and was therefore used to estimate the item statistics in this study.
5.3. Research Question 3
The third question deals with the determination of the Post-UTME economics test item parameters. The parameters were determined and are presented in Table 3, with items of special interest highlighted. Similarly, Figure 3 represents the Test Information Function for all 49 calibrated items; the maximum information provided by the test was 2.022 at an ability (θ) level of -1.600.
The results are interpreted based on the XCALIBRE recommended acceptable ranges of item parameters, a > 0.30 and -3.0 < b < 3.0 [23]. The findings reveal, on the basis of b (difficulty), that 16 (33%) of the items were problematic. The item characteristic curves of the test items show different behaviour (some items were difficult while others were easier); generally, on the basis of difficulty, 33 (67%) of the items were of acceptable difficulty level. For example, Figure 4 presents the ICC obtained for Item 1: the value on the ability scale at which the probability of a correct response is 0.5 (the difficulty parameter estimate) is low (-4.00), meaning the item is very easy for the students. Figure 5, obtained for Item 6, shows that this value is high (4.00), meaning the item is difficult for the students. Such difficult items should be rejected, modified or eliminated from the test completely. This finding is consistent with the findings of [26]-[28], which revealed that the majority of the items examined were acceptable as far as item difficulty is concerned.
On the basis of the item discrimination indices, the results indicate that 35 (71%) items failed to differentiate between students of different abilities, possessing poor or marginal discriminating ability, while 14 (29%) of the items were of moderate or good discriminating ability. These 35 items, which present marginal or poor discriminating ability (a < 0.30), cannot differentiate substantially between low- and high-achieving students; they therefore need to be reviewed or rejected. The poor performance of the identified test items and of the students could be due to poor understanding of difficult topics, ambiguity in the wording of the questions, or even an inappropriate key; it may also be due to personal variation in students' intelligence levels [29]. This finding disagrees with those of many studies, e.g., [27]-[30], which revealed that the majority of the test items used (i.e., more than 50%) were within acceptable levels of item difficulty and discrimination.
Figure 4. ICC of Item 1
Figure 5. ICC of Item 6
6. CONCLUSION AND RECOMMENDATIONS
The findings of this study indicate that, using the IRT framework, the items of the qualifying examination are not stable as far as item discrimination and item difficulty indices are concerned. The item analysis results generated may be influenced by many other factors, including examinees' poor understanding of difficult topics, ambiguity in the wording of the questions or an inappropriate key, the instructional procedure applied, and personal variation in students' intelligence levels. Similarly, the results of this study show the need to improve the Post-UTME test items of Nigerian universities, especially in developing valid and reliable items, through the mandatory involvement of experts in testing, measurement and evaluation in the process.
It is recommended that IRT be retained in the development and analysis of test items, because of its role in the investigation of reliability and in minimizing measurement errors. In compliance with the
standards and current practice in the development and validation of test items, the 'problematic' items identified in this study, having failed to satisfy the set quality criteria, should be modified, dropped, or completely eliminated from the test. The Post-UTME test items in other subjects should also be subjected to psychometric analysis to ascertain their quality.
Test items in the Post-UTME should be made to pass through all processes of standardisation and validation, and test development and content experts should be involved in developing and validating the test items in order to obtain valid and reliable results which will lead to valid inferences. Finally, further studies should be conducted with different test items and should include differential item functioning analysis, to ensure that valid and reliable measuring items are used in selecting prospective undergraduates into Nigerian universities.
ACKNOWLEDGEMENTS
The authors appreciate the Tertiary Education Trust Fund (TETFund), Ministry of Education, Federal Republic of Nigeria, for funding this project through the Institutional Based Research (IBR) scheme. Similarly, the authors acknowledge the students who participated in the tests as well as all the cooperating schools.
REFERENCES
[1] S. P. Klein and L. S. Hamilton, "Large-scale testing: Current practices and new directions (IP-182)," Santa Monica, CA, RAND, 1999.
[2] L. F. Gurski, "Secondary Teachers' Assessment and Grading Practices in Inclusive Classrooms," Master of Education thesis, University of Saskatchewan, 2008.
[3] L. S. Hamilton, et al., "Making Sense of Test-Based Accountability in Education," Santa Monica, CA, RAND, 2002.
[4] N. E. Gronlund and R. L. Linn, "Measurement and Evaluation in Teaching (6th Ed.)," MacMillan, New York, 1990.
[5] I. K. Olayemi and O. S. Oyelekan, "Analysis of Matriculation and Post-Matriculation Examination Scores of Biological Science Students of Federal University of Technology, Minna Nigeria," Ilorin Journal of Education, vol. 28, pp. 11-18, 2009.
[6] J. Okoje, "State of Nigerian Universities," A Paper Presented at the University of Port Harcourt Alumni Association forum in Abuja, 2008.
[7] M. Amatareotubo, "Post-UME screening: Matters arising," 2006. Posted to the web 8/30/2006, amasmozimo@yahoo.co.uk. Retrieved January 10, 2016 from http://www.onlinenigeria.com/articles/
[8] U. N. Akanwa and P. C. Nkwocha, "Prediction of South Eastern Nigerian Students' Undergraduate Scores with their UME and Post-UME Scores," IOSR Journal of Research & Method in Education, vol/issue: 5(5), pp. 36-39, 2015.
[9] A. Ikoghode, "Post-UTME Screening In Nigerian Universities: How Relevant Today?" International Journal of Education and Research, vol/issue: 3(8), pp. 101-116, 2015.
[10] A. A. Bichi, "Analysis of UTME and Post-UTME scores of education students at Northwest University Kano-Nigeria," Presented at the 1st International Conference on Education 2015, 9th April, 2015, Novotel Beijing Xinqiao, Beijing, China, 2015.
[11] S. O. Uhunmwuangho and O. Ogunbadeniyi, "The University Matriculation Examination as a Predictor of Performance in Post University Matriculation Examination: A Model for Educational Development in the 21st Century," Africa Research Review, vol/issue: 8(1), 2014.
[12] K. Ebiri, "Post JAMB Basis for Admission says Obasanjo," The Guardian, p. 49, 2006.
[13] R. P. Chalmers, "mirt: A Multidimensional Item Response Theory Package for the R Environment," Journal of Statistical Software, vol/issue: 48(6), pp. 1-29, 2012. http://www.jstatsoft.org/v48/i06. Accessed on 23 June, 2014.
[14] L. D. Trang, "Applying Item Response Theory Modeling in Educational Research," Graduate Theses and Dissertations, 2013.
[15] R. K. Hambleton and H. Swaminathan, "Item response theory: Principles and applications," Springer, vol. 7, 1985.
[16] X. Fan, "Item Response Theory and Classical Test Theory: An Empirical Comparison of Their Item/Person Statistics," Educational and Psychological Measurement, vol/issue: 58(3), pp. 357-381, 1998.
[17] G. R. Kiany and S. Jalali, "Theoretical and Practical Comparison of Classical Test Theory and Item-Response Theory," Iranian Journal of Applied Linguistics, vol/issue: 1(12), 2009.
[18] R. K. Hambleton, et al., "Fundamentals of Item Response Theory," Newbury Park, CA, Sage, 1991.
[19] T. G. Courville, "An Empirical Comparison of Item Response Theory and Classical Test Theory Item/Person Statistics," Ph.D. Dissertation, Texas A&M University, 2004.
[20] R. E. Schumacker, "Item Response Theory," 2010. http://appliedmeasurementassociates.com/ama/assets/File/ITEM_RESPONSE_THEORY.pdf. Retrieved on 13 August, 2014.
[21] L. Crocker and J. Algina, "Introduction to Classical and Modern Test Theory," Holt, Rinehart and Winston, New York, USA, pp. 527, 1986.
[22] X. An and Y. F. Yung, "Item Response Theory: What It Is and How You Can Use the IRT Procedure to Apply It," In Proceedings of the SAS Global Forum 2014 Conference, 2014. http://support.sas.com/resources/papers/proceedings14/SAS364-2014.pdf. Accessed on June 9, 2014.
[23] R. Guyer and N. A. Thompson, "User's Manual for Xcalibre item response theory calibration software, version 4.2," Woodbury, MN, Assessment Systems Corporation, 2014.
[24] R. K. Hambleton and R. W. Jones, "Comparison of Classical Test Theory and Item Response Theory and their Applications to Test Development," Educational Measurement: Issues and Practice, vol/issue: 12(3), pp. 38-47, 1993.
[25] J. C. Nunnally, "Psychometric Theory, 2nd Edn.," McGraw-Hill, New York, USA, pp. 701, 1978.
[26] D. M. Dimitrov, "IRT and True-Score Analysis of the NCA Tests for Teaching Skills," Technical Report, National Centre for Assessment in Higher Education, 2013.
[27] S. S. Pande, et al., "Correlation between Difficulty & Discrimination Indices of MCQs in Formative Exam in Physiology," South-East Asian Journal of Medical Education, vol/issue: 7(1), 2013.
[28] Suruchi and S. R. Rana, "Test Item Analysis and Relationship Between Difficulty Level and Discrimination Index of Test Items in an Achievement Test in Biology," Paripex - Indian Journal of Research, vol/issue: 3(6), pp. 56-58, 2014.
[29] A. A. Bichi, "Item analysis using a derived science achievement test data," International Journal of Science and Research (IJSR), vol/issue: 4(5), pp. 1656-1662, 2015.
[30] M. R. Hingorjo and F. Jaleel, "Analysis of One-Best Multiple Choice Questions: The Difficulty Index, Discrimination Index and Distractor Efficiency," J. Pak. Med. Assoc., vol. 62, pp. 142-147, 2012.
BIOGRAPHIES OF AUTHORS
Ado Abdu BICHI is a Lecturer in the Department of Arts and Social Sciences Education, Northwest University Kano, Nigeria. He obtained his M.Ed. in Educational Assessment (Psychometric Methods) and is currently undertaking graduate studies for the degree of Doctor of Philosophy (Ph.D.) in Educational Assessment (Psychometric Methods) at the Universiti Sultan Zainal Abidin, Malaysia. His areas of teaching and research interest are educational and psychological testing, measurement and evaluation, and item development and analysis using advanced psychometric models (Classical Test Theory, Item Response Theory and Cognitive Diagnostic Models), as well as research methodology and educational statistics. He is a member of the Psychometric Society and the American Statistical Association.

Dr Hadiza HAFIZ is a lecturer in the Department of Arts and Social Sciences Education, Northwest University Kano. A respected academician with over 20 years of teaching and research experience, she obtained her Ph.D. in Curriculum and Instruction from the prestigious Bayero University, Kano-Nigeria in 2015. Her areas of teaching and research interest are curriculum and instruction, social studies curriculum, teaching methodology, entrepreneurship education and research methodology. Currently, she is the Deputy Dean of Students' Affairs of Northwest University, Kano. She is a professional teacher registered by the Teachers Registration Council (TRC) of Nigeria.

Samira Abdullahi BELLO is a lecturer in the Department of Arts and Social Sciences Education, Northwest University Kano. A respected academician and researcher with over 20 years of teaching and research experience in teacher education, she is currently a Ph.D. (Curriculum Studies) scholar at Bayero University, Kano-Nigeria. Her areas of teaching and research interest are curriculum and instruction, educational technology, ICT in education and research methodology. She is a professional teacher registered by the Teachers Registration Council (TRC) of Nigeria.