A Curriculum Database with Boolean Natural-Language Searching in HyperCard

Doug Mann1, Kenneth Goodrum1, J. Michael DeWine1, John McVicker2

1Ohio University College of Osteopathic Medicine, 233 Grosvenor Hall, Athens, OH 45701, ph. 614/593-2229, fax 614/593-9180
2Ohio Program of Intensive English, Gordy Hall, Athens, OH 45701

ABSTRACT A curriculum database including both natural-language and keyword searching was developed to assist faculty in curriculum research and reform. HyperCard (with extensions) on the Apple Macintosh provides a flexible single-user or networked environment for entering, indexing, searching and retrieving content in detailed faculty notes for the instructional activities in a four-year predoctoral curriculum.

INTRODUCTION Faculty at the Ohio University College of Osteopathic Medicine (OU-COM) formed a committee in March of 1991 to develop a method to efficiently search the content of all four years of the predoctoral curriculum. In order to meet curriculum reform goals, particularly those involving the integration of basic and clinical sciences, faculty needed a way to browse the curriculum and locate the content concerning a given topic. The curriculum database program in use at OU-COM since 1985 performs many administrative functions quite well, but it contains inadequate documentation of instructional content and has slow, unsophisticated searching features. The decision was made to create a new system that would incorporate rapid boolean (combinatorial) natural-language searching of all information relevant to each course or clinical rotation: syllabi, learning objectives, complete faculty notes for each instructional activity (e.g., lecture, lab, rotation topic, case study, independent study), keywords, and administrative information.

The project had several broad goals and specifications. Because it was to serve as a research tool for faculty, it needed to be fast and easy to use. The program should be able to hold as much detailed information as the faculty would be willing to provide, and preserve as much of the format of the original notes as possible. The program needed to be networkable to make it widely available to faculty. Faculty should be able to save and print search criteria and results.

The question of whether to rely exclusively on a controlled vocabulary (i.e., keywords) for searching elicited considerable discussion. Curriculum database projects at several medical schools use combinations of MeSH and locally-developed controlled vocabularies [1]. However, there would be several drawbacks to an exclusive reliance on this approach. MeSH is a large vocabulary that, due to its clinical emphasis, contains inadequate detail for indexing many types of basic science content. No controlled vocabulary to date includes every term that a faculty member might want to use in a search. Finally, a great deal of labor is involved in generating keywords for every instructional unit in a four-year curriculum. On the other hand, natural-language searching of biomedical information may produce a more complete set of "hits" than a keyword system [1,2]. In the hands of faculty familiar with the terminology of the subject matter, boolean natural-language searching should produce good results, and allow faculty to explore any topic mentioned in the curriculum. The committee decided to develop a system that would initially emphasize natural-language searching of complete faculty notes for all instructional activities, and include a field to add keywords as needed.

0195-4210/92/$5.00 © 1993 AMIA, Inc.

SOFTWARE SELECTION Several database committee members had used Apple's HyperCard software for computer-assisted instruction and multimedia projects. HyperCard can store large amounts of text (up to 30,000 characters per field, with the number of records per file limited primarily by disk size). Beginning with version 2.0, HyperCard can store styled text, which enables it to retain the font size, style and emphasis (e.g., boldface and underlining) of a word processing document. Unfortunately, important features such as superscripts/subscripts and tab alignment are not supported in current versions of HyperCard. HyperCard "stacks" (files) can be shared on a network and accessed simultaneously by several users.

HyperCard's searching capabilities had to be extended to meet project specifications. HyperCard can search rapidly for a single character string, but extensive "scripting" (programming) is required to construct complex searching features, which substantially reduces searching speed. In the Fall of 1991, the committee purchased HyperKRS (from KnowledgeSet Corporation in Mountain View, California), a set of extensions that adds multi-stack word indexing and boolean searching features to HyperCard. HyperKRS can index every word in every field in up to 255 stacks. The searching engine accepts up to 20 terms using combinations of boolean operators and can operate across a network.
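The internals of HyperKRS are not described here. As a rough illustration of what indexing every word in every field of several stacks involves, the following Python sketch builds a simple inverted index. The stack contents, field names and the `build_index` helper are hypothetical and are not part of HyperCard or HyperKRS.

```python
from collections import defaultdict
import re

# Hypothetical stand-in for course stacks: stack name -> list of cards,
# each card a dict mapping field name -> text.
STACKS = {
    "NE": [
        {"title": "CNS stimulants", "notes": "Amphetamine abuse and tolerance."},
        {"title": "Local anesthetics", "notes": "Cocaine is a local anesthetic."},
    ],
    "PS": [
        {"title": "Substance abuse", "notes": "Alcohol abuse in adolescents."},
    ],
}

WORD = re.compile(r"[a-z0-9]+")


def build_index(stacks):
    """Map each lowercased word to the places where it occurs.

    Each posting is a tuple (stack, card_number, field, word_position).
    """
    index = defaultdict(list)
    for stack, cards in stacks.items():
        for card_no, card in enumerate(cards):
            for field, text in card.items():
                for pos, word in enumerate(WORD.findall(text.lower())):
                    index[word].append((stack, card_no, field, pos))
    return index


index = build_index(STACKS)
# Distinct cards, across all stacks, that contain the word "abuse".
cards_with_abuse = sorted({(s, c) for s, c, _, _ in index["abuse"]})
```

Because the index is built once after data entry, a lookup for a word touches only that word's postings instead of scanning every card, which is one plausible reason an indexed search outpaces scripted string searches.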


Clicking on the title of an activity in the hit list opens the course stack and activity card, and highlights a search term in context in a field (a hit in the notes automatically scrolls to the hit location). Pull-down menu commands permit rapid browsing of all hits for that activity as well as the other activities on the hit list.

SOFTWARE DEVELOPMENT Data from each of the courses and rotations reside in a separate stack, with one instructional activity per "card" (record). Database fields include: 1) instructor, 2) other involved faculty, 3) title of activity, 4) nature of activity (e.g., lecture, lab), 5) number of hours, 6) required or optional, 7) date of activity, 8) time of activity, 9) activity ID number, 10) course number, 11) year in curriculum, 12) academic quarter, 13) date of last revision, 14) keywords, 15) components (major topics), and 16) notes. If the faculty notes for a single activity occupy more than 30K of space, the activity continues on a second card. The notes are held in a scrolling text field.

Notes are prepared for entry using methods that reduce the labor involved and retain as much formatting as possible. MacLinkPlus/PC (DataViz, Inc.) is used to convert IBM-PC files to Macintosh format. These files as well as Macintosh notes files are transferred to the Microsoft Works word processor for final preparation. After course data are entered, HyperKRS is used to index the content of all course stacks. KnowledgeSet provides a standard card for previewing the contents of the index.

On the search card, word roots can be entered and followed by an asterisk to capture variable word endings (e.g., "abus*" finds "abuse," "abusing," and "abuser"). The search for each term can be limited to a specific field if desired. The boolean operators "and," "or," and "but not" are used to specify the relations between terms. The proximity between terms can be specified to ensure that terms embedded in up to 20 pages of notes for a given activity are close enough together to be conceptually related. The sample search illustrated in Figure 1 locates instructional activities that mention the topic of substance abuse.
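The search features described above (word roots with an asterisk, the operators "and," "or," and "but not," and a proximity limit) can be sketched in a few lines of Python. This is an illustration only, not HyperKRS code: the function names, the sample text, and the reading of proximity as a maximum distance in words are all assumptions.

```python
import re

WORD = re.compile(r"[a-z0-9]+")


def positions(text, term):
    """Word positions in text that match term; a trailing '*' matches any ending."""
    words = WORD.findall(text.lower())
    term = term.lower()
    if term.endswith("*"):
        root = term[:-1]
        return [i for i, w in enumerate(words) if w.startswith(root)]
    return [i for i, w in enumerate(words) if w == term]


def near(text, term_a, term_b, max_gap):
    """True if some match of term_a lies within max_gap words of a match of term_b."""
    pos_a, pos_b = positions(text, term_a), positions(text, term_b)
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)


def matches(text, required=(), any_of=(), excluded=()):
    """Combine terms: required = 'and', any_of = 'or', excluded = 'but not'."""
    def has(term):
        return bool(positions(text, term))

    return (all(has(t) for t in required)
            and (not any_of or any(has(t) for t in any_of))
            and not any(has(t) for t in excluded))


# Hypothetical fragment of faculty notes.
notes = "Tolerance develops quickly; the abuser of amphetamines risks dependence."

found = matches(notes, required=["abus*"], excluded=["alcohol"])
close = near(notes, "abus*", "amphetamine*", max_gap=3)
```

Limiting a term to one field would amount to calling these functions on that field's text alone rather than on the whole card.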
The "hit list" contains the number of times the search criteria were met, the short name of the course (e.g., "NE" represents "Neural System"), and the title of each activity that met the search criteria.
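As a minimal sketch of how such a hit list could be assembled from raw matches, the Python fragment below counts matches per activity and emits one row per activity. The sample data, the `hit_list` helper and the ordering (most hits first) are assumptions made for illustration; the ordering the actual program uses is not stated.

```python
from collections import Counter

# Hypothetical raw matches: one (course_short_name, activity_title) pair
# for each place where the search criteria were met.
raw_matches = [
    ("NE", "CNS stimulants"),
    ("NE", "CNS stimulants"),
    ("NE", "Local anesthetics"),
    ("NE", "CNS stimulants"),
]


def hit_list(found):
    """Return (hit_count, course, title) rows, most hits first (assumed order)."""
    counts = Counter(found)
    rows = [(n, course, title) for (course, title), n in counts.items()]
    return sorted(rows, key=lambda row: (-row[0], row[1], row[2]))


for count, course, title in hit_list(raw_matches):
    print(f"{count:3d}  {course} - {title}")
```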


Figure 2. Lecture found in "substance abuse" search.

By clicking on any paragraph in the notes, the instructor name, activity name, paragraph number and paragraph contents are posted to the Notebook, where the information can be reviewed, edited and printed. The curriculum database is currently installed on several individual machines, but has been successfully tested for network use. In a networked arrangement, HyperCard and the search stack reside on the user's Macintosh, and the course stacks and indexes are on the file server. The slow speed of LocalTalk increases network search times, but running AppleTalk on Ethernet will restore multi-user searching to nearly the same speed as single-computer installations.

CONCLUSION Boolean natural-language searching has been quite effective for locating instances of selected topics throughout the curriculum at OU-COM. HyperCard with HyperKRS provides a powerful method for natural-language exploration of large text-intensive databases.


References

1. Mattern, W.D., Wagner, J.A., Brown, J.S., Fisher-Neenan, L. A computerized representation of a medical school curriculum: integration of relational and text management software in database design. In P.D. Clayton (ed.), Proceedings of the Fifteenth Annual Symposium on Computer Applications in Medical Care. NY: McGraw-Hill, 1992; 323-327.
2. Friedman, B.A. The impact of new features of laboratory information systems on quality assurance in anatomic pathology. Archives of Pathology and Laboratory Medicine, 1988; 112:1189-1191.


Figure 1. Results of "substance abuse" search.
