Journal of Advanced Nursing 1979, 4, 47-56

Measurement in nursing education*


John Sheahan S.R.N., R.M.N., R.N.T., F.R.S.H., Principal Lecturer in Nursing, Faculty of Education, The Polytechnic, Huddersfield

Accepted for publication 6 March

SHEAHAN J. (1979) Journal of Advanced Nursing 4, 47-56

Measurement in nursing education

The objectives for this paper are fourfold. Evaluation is put into perspective, including measurement and assessment, in relation to the curriculum as a whole. The elements included in the evaluation of educational programmes are outlined. The process of evaluation is related to learning abilities in the cognitive, affective, psychomotor and experiential domains. Finally, some conclusions are made about the process of evaluation.

INTRODUCTION

There is no doubt that we are in the midst of an explosion of knowledge and the disciplines of nursing and education are no exceptions in this respect. Those involved in nursing education, therefore, have the twin task of keeping up to date with information on at least two fronts. In the field of education, and in curriculum studies in particular, the explosion of knowledge has given an exceptionally high yield of papers, monographs and books. Despite all that has been written on the subject, the curriculum, without doing it any serious injustice, can be resolved into the following three questions. Where are we going? How are we going to get there? And how do we know when we have arrived? This paper concentrates on the last of these questions and will be dealing with concepts such as evaluation, assessment and measurement.

THE CURRICULUM

The principal elements of a curriculum are the aims; the learning experiences; the subject matter and the general objectives; the specific objectives; the organization of the learning experiences and subject matter; and evaluation including measurement and assessment.

* This article is based on a paper read at the Association of Nursing Education of the Royal College of Nursing of the United Kingdom conference held at the Royal College of Surgeons, London, 25 February 1978.

©1979 Blackwell Scientific Publications


For a diagrammatic representation of these concepts see Figure 1. An alternative view is provided by Figure 2. It can be seen from this, for example, that a variety of factors impinge upon the nursing curriculum in the United Kingdom. Official bodies with an influence include the Department of Health and Social Security, the General Nursing Council, Regional Nurse Training Committees, the Royal College of Nursing and the European Economic Community. There are a variety of other factors such as the patients, pressure groups, reports (such as the Briggs Report 1972), nursing school directors and staff, and, of course, the learners.

FIGURE 1  A simple curriculum process: 1 Aims of the course; 2 Learning experiences; 3 Subject matter and general objectives; 4 Specific objectives; 5 Organisation of learning experiences and subject matter; 6 Evaluation, measurement and assessment

FIGURE 2  An alternative curriculum model (aims and intentions, processes, the organisation of teaching/learning environments, outcomes, evaluation and administrative convenience, together with influences such as the DHSS, the GNC, Regional Nurse Training Committees, the RCN, the EEC, politics, philosophy, patients, pressure groups, reports, directors and staff, and learners)

Aims

Aims are a key factor in the curriculum process. The Technician Education Council (1975) (TEC) has set out the aims of TEC programmes. It is sometimes helpful to examine how others approach a similar task. I have, therefore, modified the TEC aims for consideration in a nursing context.

'Every programme should aim to develop the learner's ability to think, to grasp ideas and to communicate effectively. Adaptability, curiosity, self confidence, independence of thought and the power to make critical judgements are personal qualities that should be given every encouragement to grow. In this way the learner will learn to make decisions, to exercise initiative, to respond to change, to act as an effective member of a group and to supervise the work of others where this is required.'

If nurses ought to be prepared to think, to grasp ideas, to communicate effectively, to be adaptable, self confident and to be able to make critical judgements, then the educational process ought to take these factors into account. The educational process, as represented in Figure 2, includes investigations, assignments, projects, simulations, classroom and library work, and work in the community, for example. A process is defined as the act of proceeding, progressing or advancing. Those involved in the educational process, the recipients, can be expected to make some advancement, for example, in the areas of knowledge and skills. It is this advancement, otherwise known as an outcome, which is evaluated through examinations, course work, formative and summative tests. Assessments and measurements are made and examiners are concerned with the validity, reliability and the administrative convenience of the methods used.

Evaluation

It might be helpful to define some of the concepts involved. Evaluation may be defined as a process of determining the extent to which the educational objectives are achieved by the learners. Evaluation is a systematic process and it assumes that the educational objectives have been written. The following are alternative formulations of the concept of evaluation:

1 Evaluation = Value judgements by tutors + Quantitative descriptions of learner behaviour (Measurement).
2 Evaluation = Value judgements by tutors + Qualitative descriptions of learner behaviour, i.e. Assessment (Non-measurement).

In the formulations of evaluation set out above, value judgements are a common factor. This is not surprising since the literal meaning of evaluation is to work out the value of something. Measurement is only part of evaluation; and it is not an ever-present part.

Objectives

Aims using the TEC model have been set out, but the definition of evaluation made reference to educational objectives and did not mention aims. Why not evaluate the aims? The answer is that aims are too imprecise and are no more than a target. Objectives on the other hand, 'intentions' on Figure 2, are more specific and are defined as changes in learner behaviour which it is intended to bring about by learning. Aims must, therefore, be translated into objectives in order that


performances may be evaluated. Objectives are educational specifications. What they specify is the behaviour that it is intended to bring about, the environment and the equipment required, and the acceptable level of performance. Here is an example. 'After this unit of learning the learner will answer a test paper comprising 20 multiple choice items under examination conditions, that is, the test will be invigilated. The acceptable level of performance will be 16 out of 20 right answers.'

This objective specifies the behaviour required, the conditions in which the behaviour is performed and the expected level of performance.

FORMATIVE AND SUMMATIVE EVALUATION

Summative evaluation means an end of course or final examination approach. Summative evaluation has all the same characteristics as a post-mortem. Formative evaluation, on the other hand, is used at various points as a course progresses and is concerned with testing elements which have just been completed. It is suggested that this form of evaluation enables the results of tests to be used to guide learning in a diagnostic manner. Formative evaluation is appropriate for abilities within the affective domain such as the ability to work as a member of a team, leadership qualities and the ability to make decisions. Often, as in nursing, a combination of summative and formative evaluation is used: the final written examination is the summative element, the practical nursing assessments the formative elements. Whether evaluation is formative, summative or a combination of both, the concepts of validity and reliability apply.

Validity

If a test achieves what the compilers intended, it is a valid test. There are many aspects to validity but in this paper we will be concerned only with content and predictive validity. The first aspect to be considered is content validity. For an examination to have content validity it must contain qualitative and quantitative representation in the sampling of the whole syllabus. This rests upon the judgement of the examiner in compiling questions from a syllabus. Essay-type examination papers often fall short of this requirement. This is something which has not escaped the notice of examination 'spotters'. Such people predict that certain topics are more likely to appear in a particular examination than others and concentrate revision around their hunches. What is more, of course, they are sometimes right. One way of ensuring content validity, that is a comprehensive coverage of the syllabus in terms of the stated objectives and the subject matter, is the use of objective-type items such as multiple choice questions. Another important factor in ensuring validity concerns the objectives of a course. It is necessary to know whether the questions set meet the objectives. There is, however, no ready-made formula for finding content validity.


Teachers of nursing must make up their minds using their own experience. Where possible a second opinion should be sought because of the qualitative element present.

Predictive validity, as its name suggests, is concerned with predicting future performance based on test results of the present. When it comes to entry into the nursing profession, the question is how does possession of the General Certificate of Education examinations passed at 'Advanced' or 'Ordinary' levels at school, or entry by means of the Special GNC entrance test, relate to the future performance of the learner. In the case of the final examination in nursing, an examiner, in passing a candidate, is in effect enabling the granting of a licence to practise nursing. The examiner is predicting that the candidate is, to use the phrase, 'a fit and proper person' to be admitted to the ranks of registered or enrolled nurses. The relationship between the various aspects of validity and other elements of the evaluation process is shown in Figure 3.

FIGURE 3  Elements in the evaluation process (WHAT? — general objectives, specific objectives, content and predictive validity; HOW? — objectivity, reliability)

Reliability

A reliable test is one which will give a consistent score from one occasion to another for the same individual or group, irrespective of the person who marks it. Reliability is thus concerned with the adequacy of the measurement and not the content. Another term for reliability is the consistency of a test. Marker inconsistency is a major contributor to low reliability (mark/re-mark reliability). There are, for example, inconsistencies when an examiner marks a particular paper on different occasions. This may be due to fatigue, a change of mood or a varying interpretation of what the expectations should be. Essay-type questions are especially prone to varied interpretations because of the influence of handwriting, the style and the standard of the English. A classic example of the fluctuations in essay marking is found in the work of Hartog and Rhodes (1936) (Figure 4).

FIGURE 4  Disparity in examination marks in an essay question*

Candidate No.   Examiner a   b      c      d      e      Discrepancy range
9               48           [30]   (55)   (55)   40     25
13              (67)         50     45     [42]   52     25
23              42           [35]   (60)   58     47     25
34              (65)         52     [40]   60     55     25
47              32           36     35     (55)   [30]   25
25              60           [32]   65     50     (68)   36
Mean score      52.3         39.2   50.0   53.3   48.7

(Round brackets mark the highest and square brackets the lowest mark awarded to each candidate.)

* From Hartog & Rhodes (1936) with permission of the publishers.
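To make the arithmetic behind Figure 4 concrete, here is a minimal sketch (not part of the original paper) that recomputes the derived columns of the table, each candidate's discrepancy range and each examiner's mean mark, from the marks awarded. It assumes nothing beyond the figures shown above.

```python
# A minimal sketch, not from the paper: recomputing the derived columns of
# Figure 4 from the marks awarded by examiners a-e to each candidate.

marks = {  # candidate number -> marks awarded by examiners a, b, c, d, e
    9:  [48, 30, 55, 55, 40],
    13: [67, 50, 45, 42, 52],
    23: [42, 35, 60, 58, 47],
    34: [65, 52, 40, 60, 55],
    47: [32, 36, 35, 55, 30],
    25: [60, 32, 65, 50, 68],
}

# Discrepancy range per candidate (highest mark minus lowest mark):
# 25 for the first five candidates, 36 for candidate 25.
for candidate, row in marks.items():
    print(candidate, max(row) - min(row))

# Mean mark per examiner: 52.3, 39.2, 50.0, 53.3, 48.7 -- the same six
# scripts average anywhere between about 39 and 53 depending on the marker.
for label, column in zip("abcde", zip(*marks.values())):
    print(label, round(sum(column) / len(column), 1))
```

The spread in the examiner means is the marker inconsistency discussed above; a precise marking scheme aims to shrink it.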

There are two ways of improving reliability. The first is to formulate and to work to precise marking schemes. The second way is to arrange that there is just one possible answer to each question. This ideal is difficult to achieve when a student is required to produce the answer, but it is possible when a student is asked to recognize and choose the correct answer when it is provided, often mixed with plausible wrong answers. The implication here is that instead of the marker having to interpret and judge the student's answer, the student has to think of the answer he would have written down. The student then goes on to select from among the answers provided the one which best agrees with his opinion.

A question which is written in such a way that the marker does not have to exercise any judgement, or use any subject matter knowledge, in deciding whether a student's responses are correct is called an objective item. Such items have a 100% mark/re-mark reliability. Unlike the marking of essay-type questions, which involves the process of assessment, the use of objective-type items is a form of measurement. Objective-type items can be standardized in a manner, using statistical techniques, which is not possible with other types of test questions.

There are two important indices which help in forecasting the suitability of objective-type items. The first of these is the facility (difficulty) index which shows whether the candidates have found an item easy or difficult. The second is the discrimination index which shows how far an item distinguishes the high from the low scoring candidates. The concepts may be more meaningful if considered in relation to an example. If 100 candidates attempt an item and only 10 get it right, this item would have a facility index of 0.1. Similarly, if 100 candidates attempt another item and 90 get it right, then the facility index is 0.9. Although dealing with precise measurement, it is a matter of judgement whether such items would be included in a test. It is doubtful if they would, because items with indices in the 0.3 to 0.7 range are more likely to be chosen. Similar units are used in the calculation of the discrimination index. An ideal discrimination index is 0.5 and the acceptable range is similar to that for the facility index, that is, 0.3 to 0.7.
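As an illustration only, the sketch below (not from the paper, which gives no formulae) computes the two indices using one common convention: the facility index as the proportion of candidates answering an item correctly, and the discrimination index as the difference in that proportion between the top- and bottom-scoring groups of candidates. The 27% group size and the exact formula are assumptions; other conventions exist.

```python
# A hedged sketch of how facility and discrimination indices might be
# computed for an objective item; the formulae are assumptions, not the
# paper's. Responses are 1 (correct) or 0 (wrong), one per candidate.

def facility_index(responses):
    """Proportion of candidates who answered the item correctly."""
    return sum(responses) / len(responses)

def discrimination_index(responses, total_scores, group_fraction=0.27):
    """Contrast item success in the highest- and lowest-scoring candidates."""
    n = len(responses)
    k = max(1, int(n * group_fraction))
    order = sorted(range(n), key=lambda i: total_scores[i])  # rank by total score
    low, high = order[:k], order[-k:]
    p_high = sum(responses[i] for i in high) / k
    p_low = sum(responses[i] for i in low) / k
    return p_high - p_low

# Item answered correctly by only the 10 highest-scoring of 100 candidates.
item = [0] * 90 + [1] * 10
totals = list(range(100))          # hypothetical total test scores
print(facility_index(item))        # 0.1 -- the paper's 'only 10 get it right' example
print(round(discrimination_index(item, totals), 2))  # about 0.37
```

Whether an item with a given index is then retained remains, as the paper notes, a matter of judgement.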


Before moving on it may be helpful to relate validity and reliability. A test which is unreliable cannot be valid. On the other hand, if a test is highly reliable it does not mean that it will be valid. It might be highly reliable at measuring something which the designers never intended. In a nutshell, tests can be reliable and invalid but they cannot be unreliable and valid. For a pictorial representation see Figure 5.

FIGURE 5  A diagrammatic representation of reliability and validity (consistently hitting the target in the same place is reliability; how well it hits the centre of the target is validity)

Administrative convenience

Administrative convenience is a major factor in the process of evaluating learning. When content and predictive validity, and reliability, can be assured through the use of objective tests, machine or unskilled markers may be used, thus keeping the administrative costs low. As well as keeping the costs low, such a method readily provides data which may be analyzed for a variety of purposes such as research. When reliability is difficult to obtain, for example from tests designed to provide evidence of learning in the higher cognitive domains or from the affective domain, the administrative costs tend to rise as each candidate's response has to be judged by an expert. This is the case in the ward-based practical assessments in nursing. Psychomotor or manipulative skills must be observed on a one-to-one basis in order to be assessed. Similarly social or interactive skills, involving attitudes and thus in the affective domain, must be observed on a similar basis in order to be assessed. Reliability in this aspect of assessment may be improved in the case of manipulative skills by the use of check lists which have been drawn up by a group of professionals rather than by an individual. To improve reliability in relation to attitudes is a more difficult task. What about using some of the many attitude


measures available? The first objection to this is that many are not as reliable, and thus as scientific, as the experts would like them to be. The second objection is that the costs and the administrative efforts of arranging attitude tests for all nursing students would be prohibitive. It is thus back to professionals making assessments of aspirants to the profession. But this is in no way unusual. Most people in their daily lives are concerned with assessing people: often they may have a direct purpose or interest in mind when they make these assessments. They might want to know how someone will vote in an election, how they might behave in a committee or whether they will keep a secret, and so on. Some of these assessments may be based on intuition; others are based on common-sense knowledge evolved through personal experience. If nursing is an art, as well as a science, intuition cannot be excluded. If, on the other hand, nursing wants to be a profession, its practices must have a scientific basis. Where science is found wanting, as in attitude measurement, for example, mature professional assessments based on common sense will have to suffice.

DOMAINS AND ABILITIES

As indicated above, evaluation is a systematic process and as such depends on objectives being set out. Human abilities have been classified in domains and Bloom (1956), a notable contributor to the subject, set out objectives in the cognitive domain. Krathwohl (1964) dealt with those in the affective domain. Simpson (1977) has set out her ideas about objectives in the psychomotor domain. Steinaker & Bell (1975) have proposed a taxonomy of objectives in the experiential domain. These objectives and the aspects considered within each of the domains may be set out in tabular form (Table 1).

When it comes to measurement, abilities in the lower end of the cognitive domain can be measured, in particular the memory aspects of learning. Further up the cognitive domain the ability to measure diminishes. The top item in the cognitive column in Table 1 is evaluation. Measurement of evaluation is an untenable proposition. Similarly the elements in the affective domain lend themselves to assessment rather than measurement.

The elements in the psychomotor domain can be measured. This is the operational element of objectives. The student does something. What he does can be observed and measured. These measures can be judged in terms of the student's progress and against prescribed standards such as those acceptable in nursing. But what is acceptable in nursing? It is likely that many would agree that it is competence in practice, competence being taken as an individual's ability to produce agreed results efficiently and effectively. The minimum standard is safety in practice.

The psychomotor area is of particular interest when it comes to measurement because the activities involved are readily observable. However, assessment as well as measurement is involved. Consider the item 'admitting' from the list given in Table 1.

TABLE 1  A tabular representation of four domains

Cognitive domain:     Evaluation, Synthesis, Analysis, Application, Comprehension, Knowledge
Affective domain:     Organization, Conceptualization, Valuing, Responding, Receiving
Psychomotor domain:   Weighing, Testing, Reassuring, Preparing, Measuring, Lifting, Explaining, Dressing, Connecting, Comforting, Assisting, Assembling, Admitting, Administering
Experiential domain:  Dissemination, Internalization, Identification, Participation

All the stages of admitting a patient to hospital can be set out on a check list, for example. The items can be checked against the learner's performance and given an appropriate rating. However, there are elements in the process which are less easy to measure, such as explaining and reassuring. These are elements which cannot be separated from the physical tasks. If it is necessary to measure these aspects then what is expected of the performer must be stated. The difficulty arises because some people will be happy with a short, simple explanation; others will need more details. There are many other variations. But measurement demands that what is expected and what will be acceptable in every case must be specified precisely. When it comes to social skills such as those of explaining, which are often inseparable from some psychomotor skills, assessment is more appropriate than measurement. Some measurement is involved in evaluation, but some assessment is also involved: the first formulation of evaluation noted above in this paper.

Safety in practice

The concept of safety in practice as a standard for patient care can be criticized. For example, many nurses would subscribe to the concept of excellence in practice. It can be argued, therefore, that if excellence in practice is what is being aimed at, the baseline of safety in practice is too low a standard. The gap between the baseline and the ideal is too wide. But, when considered in the context of one's ability to measure, the virtues of the concept of safety in practice tend to cancel any weaknesses it may have.

CONCLUSION

Underlying all discussions about measurement in nursing education are two important and related concepts, and these are standards of patient care and entry


to the nursing profession. And from these emerge two very important questions. Can we measure? Should we measure?

Yes, abilities in the lower order of the cognitive domain and in the psychomotor domain can be measured. It is less easy to measure abilities in the higher orders of the cognitive and the affective domains. And no attempt at measurement in the experiential domain has been made.

Should we measure? Rigorous measurement would presumably exclude the necessity for judgement. Or would it? Measurement can provide objective items bearing facility and discrimination indices. But whether an item with a facility index of 0.1 or 0.9 is used is a matter of judgement. Accurate facility and discrimination indices are not too difficult to achieve. But accurate measurements, having validity and reliability, are more difficult to achieve in other aspects of nursing which must be taken into account. Attitudes have been mentioned as an example. Then there is the administrative convenience to be taken into account in a programme of wholesale measurement. Another question concerns what is called positivism. This is the study of man within the framework of the natural sciences. Some see this as an inappropriate framework where human behaviour is concerned and reject positivism.

Those concerned with nursing education should measure but should not confine themselves only to measurement. Nursing uses three approaches to testing. It uses formative tests, such as practical nursing assessments, which provide opportunities for learning as well as for assessing performance; measurement and assessment are involved. There are summative tests in the form of objective items and essay-type questions. These summative tests involve measurement and assessment. This combination has a lot to commend it but will, of course, have to be kept under review in the light of research findings.

Acknowledgments

I am grateful to Dennis Carroll, M.Ed., and to Ted Duggan, M.Ed., of the Faculty of Education, Huddersfield Polytechnic, who were unstinting with their ideas when I was preparing this paper.

References

BLOOM B.S., KRATHWOHL D. & MASIA B. (1956) Taxonomy of Educational Objectives: Cognitive Domain. David McKay, New York.
HARTOG P. & RHODES E.C. (1936) The Marks of Examiners. Macmillan, London.
KRATHWOHL D.R. & BLOOM B.S. (1964) Taxonomy of Educational Objectives: Affective Domain. David McKay, New York.
REPORT OF THE COMMITTEE ON NURSING (The Briggs Report) (1972) H.M.S.O., London.
SIMPSON E.J. (1977) The classification of educational objectives: psychomotor domain. Illinois Teacher of Home Economics 10, 110-114.
STEINAKER N. & BELL M.R. (1975) A proposed taxonomy of educational objectives: the experiential domain. Educational Technology, January, 14-16.
TECHNICIAN EDUCATION COUNCIL (1975) Circular TEC 3/75. Technician Education Council.
