Acta Psychologica 39 (1975), 131-139
© North-Holland Publishing Company

CONCEPT IDENTIFICATION WITH NATURAL MATERIAL

Tarow INDOW, Michiko KOBAYASHI and Sayoko DEWA
Department of Psychology, Keio University, Minato-ku, Tokyo, Japan

Received October 1974

Phonetic letters in Japanese (Katakana, 23 in number) written by 6 adults (Ws) were presented one by one and the S was asked to guess by which W each was written. Immediately after the S made the guess, the code of the correct W was given on each trial. The Ss were Japanese housewives, students etc., 53 in number. The rate of correct identification increased during the 23 × 6 trials, but only from 0.3 to 0.4 on the average. The learning transferred to the remaining 23 letters of the same kind but not to letters of the other kind (Hiragana). On each trial, the S verbally described the cues upon which the guess was made. When the data were analysed separately according to Ws, it was found that the letters of the W who was most difficult to identify were most ‘diffused’ in characterization by the Ss. The relevance of the present experiment to the study of concept identification in general is also discussed.

Most studies of concept identification have used rather artificial stimuli in experimentation, such as geometrical figures, strings of letters etc. These stimuli have distinctive attributes, e.g., presence or absence of particular letters, and are useful in extracting strategies being used by the S in the process of attaining the concept (e.g., Indow and Suzuki 1972, 1973). However, it will remain open to question whether the process in this situation is essentially the same as, or qualitatively different from, the process by which a child attains, for example, the concept of ‘dog’ through experience in the early period of its life. The situation is analogous to the relationship between experimental study of memory using nonsense syllables and, for example, study of what has been called lexical memory by Miller (1972). For the purpose of filling the gap between concept identification with artificial stimuli and formation of natural concepts in real life, an experiment was conducted with handwritten letters. When we receive a card from a close friend, we usually have no difficulty in telling at a glance from whom it came.


In order to simulate the process making the identification possible in real life, letters written by unknown persons were presented to Ss, and they were trained, in the main experiment, to identify the writers under the same circumstances as in concept identification experiments of an ordinary type.

Method

Stimuli and subjects

Each of the 6 adults (A to F) wrote 46 Japanese phonetic letters in Hiragana (H) style as well as in Katakana (K) style. Examples are given in fig. 1. Each letter was written in black ink in a designated space of 1.5 × 1.5 cm on a white card 9.0 × 5.5 cm in size. In the following 4 experiments, I to IV, 53 Ss were employed: housewives, white-collar workers, and university students. All the experiments were carried out individually and it took about 2 hours per S to complete all four.

Experiment I

The stimuli consisted of 23 H letter cards from each of the 6 writers (Ws). The remaining 23 H letters were saved for Exp. IV, for the purpose of testing a possible transfer effect of the training. The S was asked to classify the 138 (23 × 6) cards into 6 piles on the table according to W. Six cards of the same letter were handed to the S at a time, with the instruction that these had been written by 6 adults, A to F. There was no time limit and, when the S had distributed the cards into the 6 piles by placing each card face up, 6 cards of another letter were handed over. In this way the experiment was continued until 23 cards had been placed in each of the piles.

Fig. 1. Examples of H and K letters written by the 6 writers. The upper letter is pronounced (ki) and the lower (ho).


Fig. 2. Rates of correct identification of all the experiments. In Exp. II the S was informed of the correct writer after having made a guess on each trial. (Panel labels: W: C; deteriorating, 10/53 (S: 37); no change, 13/53 (S: 30); improving, 30/53 (S: 22); grand means (A to F, 53 Ss). Abscissa: n, trials.)

Experiment II

This was the main experiment, and the stimuli consisted of 23 K letters from each of the 6 Ws. These 23 K letters were chosen independently from the set of 23 H letters used in Exp. I. The S was asked to identify W when the cards were presented one by one and, immediately after the S responded with one of the codes A to F, the correct answer was given. The first 6 presentations comprised cards of randomly chosen letters, one from each of the 6 Ws. From the seventh trial on, letters and Ws were randomized and the S was encouraged to tell, in detail, upon what characteristics of the letter he had made the guess. The experiment was completed when all the 138 cards had been presented in a semi-random order. A lengthy run of the same writer was deliberately avoided.
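
The presentation schedule of Exp. II can be illustrated with a short Python sketch. It is not the authors' procedure: the paper does not say how the semi-random order was produced or what counted as a lengthy run, so the rejection-sampling approach, the function names and the limit `max_run=3` are assumptions made for illustration only.

```python
import random

WRITERS = "ABCDEF"
N_LETTERS = 23  # K letters per writer used in Exp. II


def longest_run(writers):
    """Length of the longest run of identical consecutive writers."""
    best = run = 1
    for prev, cur in zip(writers, writers[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def presentation_order(seed=0, max_run=3):
    """Return the 138 cards as (writer, letter_index) in presentation order.

    The first 6 trials show one randomly chosen card from each writer.
    The other 132 cards are reshuffled until no writer appears more than
    `max_run` times in a row (max_run is a hypothetical threshold).
    """
    rng = random.Random(seed)
    cards = [(w, i) for w in WRITERS for i in range(N_LETTERS)]
    first = [(w, rng.randrange(N_LETTERS)) for w in WRITERS]
    rng.shuffle(first)
    rest = [c for c in cards if c not in first]
    while True:
        rng.shuffle(rest)
        order = first + rest
        if longest_run([w for w, _ in order]) <= max_run:
            return order
```

Calling `presentation_order(seed=1)` yields one admissible schedule; a different seed gives a different order under the same constraints.
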

Experiments III and IV

In order to test the transfer effect of the training given in Exp. II, these two experiments were included. Classification of cards without information feedback about the Ws was carried out in exactly the same manner as in Exp. I with the following sets of cards: in Exp. III the remaining 138 K cards which were not used in Exp. II, and in Exp. IV the remaining 138 H cards which were not used in Exp. I.

Results

Rates of correct identification of W are plotted in fig. 2. The circles, p(n) plotted against trial number n on the abscissa, represent the results of Exp. II as averages over 12 consecutive trials. For example, the circle on the 18th trial represents the average rate of correct identification from the 7th to the 18th trial. The rate was not defined on the first 6 trials. The results of Exps. I and IV are indicated by crosses and the result of Exp. III by a triangle. In Exp. I, the Ss had no idea which pile of cards, when the classification was completed, corresponded to which of the Ws. Hence, correct identification in this case was defined as follows. For each of the 6 piles, the W whose cards were the largest in number in that pile was defined to be representative, and the proportion of those cards was taken as the rate of correct identification. In Exps. III and IV, the Ss also formed 6 piles of cards. However, they could tell in these cases which pile represented which W, and the rate of correct identification was defined in a straightforward way. The thick curve at the bottom represents grand means over Ss and Ws.

Clearly, p(n) exhibits an increasing trend during the training phase in Exp. II. However, the slope is rather small, as will be discussed later. From the comparison of the triangle and the circles, it will also be clear that the training with the 23 K letters in Exp. II brought about the same improvement in the identification of the remaining 23 K letters presented for the first time in Exp. III. On the other hand, no appreciable change was observed between the two crosses at the furthest left and right ends, which implies that the improvement in identifying Ws of K letters had no transfer effect at all when Ws of H letters were guessed in Exp. IV. It will be noticed, however, that p(n) was considerably larger for H letters than for K letters in general. As is clear from fig. 1, H letters are more round and curvilinear than K letters.

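
The two scoring rules described in this section, the 12-trial rate p(n) of Exp. II and the majority-writer rule for the piles of Exp. I, can be sketched in Python as follows. The function names are hypothetical. Two points are assumptions rather than statements of the paper: p(n) is computed here for every n from 18 on (taking only every 12th value gives non-overlapping blocks), and the pile rate is pooled over all piles.

```python
from collections import Counter


def windowed_rate(correct, window=12, first=7):
    """p(n): proportion correct over the `window` trials ending at trial n.

    `correct` holds one 0/1 outcome per trial, starting with trial 1.
    The first 6 trials are excluded, so the earliest window covers
    trials 7 to 18. Returns a dict {n: p(n)}.
    """
    rates = {}
    for n in range(first + window - 1, len(correct) + 1):
        block = correct[n - window:n]  # trials n-11 .. n (1-based)
        rates[n] = sum(block) / window
    return rates


def pile_rate(piles):
    """Exp. I scoring: each pile is represented by its most frequent writer.

    `piles` is a list of piles, each a list of the true writer codes of
    the cards placed in it. The cards of the representative writer are
    counted as correct and pooled over piles (pooling is an assumption).
    """
    majority = sum(Counter(p).most_common(1)[0][1] for p in piles if p)
    total = sum(len(p) for p in piles)
    return majority / total
```

For instance, `pile_rate([["A", "A", "B"], ["C", "B", "B", "B"]])` counts 2 + 3 representative cards out of 7.
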
By inspection, it was easy to classify the individual results of the 53 Ss into three different types: 30 Ss showed improvement, 13 Ss stayed at about the same level, and 10 Ss showed rather decreasing trends in p(n) in Exp. II. As examples, three individual curves, one from each type, are given in fig. 2 (the second to the fourth from the bottom). In these curves, each circle represents the rate of correct identification over the 12 cards from the (n − 11)th to the nth trial. When p(n) was defined separately with regard to each W, p(n) exhibited more or less increasing trends for the cards of 5 Ws (B to F), whereas it was almost stationary for the cards of A. Two examples of individual data (A and C) are given at the top of fig. 2, where each circle represents the rate at which the W was correctly identified by the 53 Ss from the (n − 11)th to the nth trial.

As mentioned before, the S described, on each trial n (> 6) in Exp. II, the characteristics of the handwritten letter that were used as cues in inferring the W. Characteristics referred to by the Ss were ‘round’, ‘strong’, ‘beautiful and delicate’, etc. For each W, the cues mentioned throughout all the trials were counted, so as to determine the frequency f of each characteristic. Then the characteristics were ranked in terms of f from r(f) = 1 on, in decreasing order. In other words, r(f) = 1 denotes the characteristic most frequently mentioned by the Ss in describing K letters written by the W. In fig. 3, f is plotted against r(f). The plotting is useful in making explicit the difference between characteristics which play, to the eyes of all the Ss, the role of general factors throughout all the K letters written by the W, and characteristics which have been fortuitously mentioned by some Ss with reference to some K letters of the W. The same type of plotting is used in quality control under the name of Pareto curve with reference to causes of defects, with the purpose of disclosing the most responsible causes. Zipf (1965) used a similar plotting with regard to counts of words in a given text. In fact, the relationship was almost linear in the log f-log r(f) plot, which is widely known as Zipf’s law. Because plotting of f against r(f) yielded essentially the same pattern for writers B to F, and a distinctively less steep curve for A, the average curve for B to F and the individual curve for A are shown by circles and crosses, respectively, in the upper part of fig. 3. Notice that the characteristics falling under r(f) differed, as a matter of course, from one W to another and hence, for example, r(f) = 1 in the average curve denotes different characteristics for B to F.
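
The rank-frequency tabulation described above can be sketched in Python. The function names are hypothetical, and the least-squares fit of log f on log r(f) is only one simple way to summarise steepness; the paper does not report how, or whether, a slope was fitted.

```python
import math
from collections import Counter


def rank_frequency(cues):
    """Return [(rank, cue, f)], with rank 1 for the most frequent cue."""
    counts = Counter(cues).most_common()  # decreasing frequency
    return [(r, cue, f) for r, (cue, f) in enumerate(counts, start=1)]


def loglog_slope(freqs):
    """Least-squares slope of log f against log rank.

    `freqs` must be listed in rank order (rank 1 first). A slope near -1
    corresponds to the classical form of Zipf's law.
    """
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Made-up cue counts for one writer, for illustration only.
toy_cues = ["round"] * 8 + ["strong"] * 4 + ["delicate"] * 2 + ["thin"]
table = rank_frequency(toy_cues)
slope = loglog_slope([f for _, _, f in table])
```

A flatter curve, such as the one reported for A, would give a slope closer to zero than the curves for B to F.
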
In these overall curves, the distinction between characteristics more frequently referred to and characteristics less frequently referred to is larger for B to F than for A, which corresponds to the fact, shown in fig. 2, that the rates of being correctly identified increased in the course of training for these 5 Ws and remained almost constant for A. The plottings in the middle part of fig. 3 were obtained in the following way. Only cues mentioned when an S correctly identified a W were counted. First, for each W, the frequency of each cue f_i and its rank r(f_i) were determined separately for the respective Ss. Next, the mean of f_i over the Ss was denoted by f, and f was plotted against the rank. Notice that the characteristics falling under r(f) were not the same from S to S. Again, the curves thus obtained exhibited the same pattern for B to F and a somewhat different form for A. What are plotted in the middle part are the average curve for B to F and the individual curve for A. It will be clear that the cues at the higher ranks of frequency, r(f) = 1, 2, led to correct identification less frequently for A than for the remaining 5 Ws, which is also in agreement with the finding obtained with the overall curves. It seems worth mentioning that the cues contributing to correct identification in each S were fewer than 6 in number. The curves at the bottom should be read as follows. First, for each W, only the cues corresponding to r(f_i) = 1 were taken into account for the respective Ss. In other words, the sort of cue was collected from each S that was most frequently referred to when the W was correctly identified by the S. However, the sort of cue, if any, was discarded when it was referred to by the S in correctly identifying the W only once throughout the course, even when it was most frequent and corresponded to r(f_i) = 1. Cues thus collected over the Ss were ranked according to the number of Ss, and f, the number of Ss, was plotted against the rank r(f).
For example, the first cross of the curve indicates that there were 6 Ss who were led to correct identification of A by the characteristic corresponding to r(f) = 1. The plotting was truncated at r(f) = 20, though both crosses and circles had long tails to the right. A marked difference between A and the other 5 Ws will be noticed: the cues most suggestive of W for the respective Ss were more widely ‘diffused’ over the Ss for A than for the remaining Ws. Rates of correct identification of each W by the Ss on the basis of the respective most suggestive cues used in the above-stated plotting in fig. 3 are plotted against trial number n in the same manner as in fig. 2. The average curve for all the Ws and two individual curves (A and C, corresponding to the two examples in fig. 2) are shown in fig. 4. The slope of the average curve is of the same order of magnitude as that of the corresponding average curve at the bottom of fig. 2. However, the rates are at a higher level here because only correct identifications based upon the most suggestive characteristics of the respective Ws have been taken into account. Again, no increasing tendency was observed with the curve of A, as in fig. 2, whereas the curve of C is about 4 times steeper here than in fig. 2, which implies that the learning is more marked with the most suggestive cues.

Fig. 3. Results concerning cues by which the Ss reported to have made identification. For more details see text. (Panels: overall; correct identification; most frequently used in correct identification. Abscissa: rank of frequency, r(f). Symbols: average data from B to F; individual data of A.)

Fig. 4. Improvement of identification, which was based upon the most suggestive cues of the respective writers for each S. (Panel labels: W: C; grand means (A to F). Abscissa: trials.)

Discussion

From the everyday experience that we can easily identify the handwriting of our friends, the authors expected that identification of W would be quickly learned. Actually, however, the improvement appeared to be gradual and rather slow, so that no S achieved perfect identification of any W within the 138 trials. If linearly extrapolated, it will be inferred that 480 trials would be necessary on the average until K letters by the 6 Ws were discriminated without confusion. One reason for the slow learning lies in the circumstance that individual letters were used as stimuli rather than a series of letters. When we can tell at a glance from whom a card came, the information seems mainly to be conveyed by a continuation of letters in the card.

In the case of concept identification with artificial stimuli, where presence or absence of a particular letter in a string of 5 letters was the (one-dimensional) criterion in defining the concept, the Ss could be divided into several groups according to their strategies for finding the criterion. In the group of Ss who adopted the most efficient strategy the concept was attained within 5 trials at the latest, and even in the group with a much less efficient strategy most Ss attained the concept within 20 trials (Indow and Suzuki 1972, 1973). When the criterion was of conjunctive form (e.g. A ∩ B) the learning was quicker, whereas when the criterion was of disjunctive form (e.g. A ∪ B) the learning became far more difficult, a report of which will soon be made elsewhere (Indow et al. 1974). As a matter of course, the difficulty of concept learning depends upon a number of factors: number of relevant and irrelevant attributes or dimensions, number of values in each dimension, distinctness of these values and dimensions, type of criterion, number of concept categories, probability of information feedback, etc. The structure of the above-stated experiment with alphabetical letters can be described as follows: 10 dimensions in total, 2 values in each dimension (presence and absence), extremely distinct, single letter and conjunction or disjunction of two letters, 2 concept categories, and 100% feedback. Some experiments with artificial stimuli have shown that the concept was more difficult to attain than reported above.
For example, according to Wandmacher and Vorberg (1974), who used geometrical figures as stimuli, in order to attain conjunctive concepts of 2 values (up to 5 dimensions in total, 2 values in each dimension, very distinct, 4 concept categories, and 100% feedback), it took about 26, 40, and 94 trials on the average, according to the number of irrelevant dimensions (1, 2, and 3). The structure of the present experiment is not easy to specify, in contrast to situations with artificial stimuli. Of the factors enumerated above, only two can be clearly specified in the present experiment: 6 concept categories and 100% feedback. It was pointed out before that the characteristics of handwriting that are actually taken into account in correct identification of W are presumably not very large in number. However, dimensions and values are neither separable nor distinct because it seems unlikely that all the cues for a W are present in all the letters written by the W. When the S referred to a set of cues in guessing the W of a letter, the description was usually of conjunctive form. However, the cues were not quite consistent throughout the letters of the same W, and the criterion, if defined over all the trials, would be of disjunctive form. As mentioned before, a disjunctive concept is the most difficult to attain. Furthermore, the characteristics of handwritten letters which correspond to dimensions and values are much larger in number, as shown in the overall curves at the top of fig. 3, than the dimensions and values in artificial stimuli. All concepts to be obtained in everyday life will have these circumstances in common.
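
The distinction drawn here between a conjunctive description of a single letter and a disjunctive criterion over all letters of a writer can be made concrete with a small Python sketch. The helper names and the cue sets are hypothetical; only ‘round’, ‘strong’ and ‘delicate’ echo cue words quoted in the Results, and ‘slanted’ and ‘large’ are invented for the example.

```python
def conjunctive(cues):
    """Criterion met only if every cue in `cues` is observed."""
    required = frozenset(cues)
    return lambda observed: required <= set(observed)


def disjunctive(criteria):
    """Criterion met if at least one of the sub-criteria is met."""
    criteria = list(criteria)
    return lambda observed: any(c(observed) for c in criteria)


# Each letter of a writer supports its own conjunction of cues, so the
# writer as a whole is described by a disjunction of conjunctions.
writer_rule = disjunctive([
    conjunctive({"round", "strong"}),
    conjunctive({"delicate", "slanted"}),
])

hit_first = writer_rule({"round", "strong", "large"})   # first conjunction
hit_second = writer_rule({"delicate", "slanted"})       # second conjunction
miss = writer_rule({"round", "delicate"})               # neither is complete
```

A learner who has met only letters of the first kind has no evidence for the second conjunction, which is one way to see why a criterion of this form is slow to attain.
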

References

Indow, T., S. Suzuki, 1972. Strategies in concept identification: stochastic model and computer simulation I. Jap. Psychol. Res. 14, 168-175.
Indow, T., S. Suzuki, 1973. Strategies in concept identification: stochastic model and computer simulation II. Jap. Psychol. Res. 15, 1-V.
Indow, T., S. Dewa, M. Tadokoro, 1974. Strategies in attaining conjunctive concept: experiments and simulations. (To appear in Jap. Psychol. Res.)
Miller, G.A., 1972. Lexical memory. Proc. Amer. Phil. Soc. 116, 140-144.
Wandmacher, J., D. Vorberg, 1974. Application of the Bower and Trabasso theory to four-category concept learning with probabilistic feedback. Acta Psychol. 38, 205-213.
Zipf, G.K., 1965. The psychobiology of language: an introduction to dynamic philology. Cambridge, Mass.: MIT Press.
