Electrocardiographic data compression via orthogonal transforms.


IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. BME-22, NO. 6, NOVEMBER 1975

Electrocardiographic Data Compression Via Orthogonal Transforms

NASIR AHMED, MEMBER, IEEE, PAUL J. MILNE, MEMBER, IEEE, AND STANLEY G. HARRIS

Abstract-Electrocardiographic data compression via orthogonal transform processing is studied using canine ECG data. The Haar transform and the discrete cosine transform are considered. While the basis vectors for the Haar transform are sampled rectangular waves, those for the discrete cosine transform are sampled sinusoids. Experimental results show that a 3:1 data compression is feasible. That is, if N words are required to store an ECG in its original form, then only N/3 words are required to store it in terms of its transform components. At this level of compression, the differences between the original and retrieved ECG data are not diagnostically significant.

I. INTRODUCTION

THIS paper presents the results of an initial study to determine the feasibility of securing electrocardiograph (ECG) data compression via orthogonal transforms. The study was conducted using canine ECG data. Data compression leads to savings in the amount of memory required to store ECG's in digital form. Since over 70 million ECG's are run yearly in the U.S. and over 160 million in the world, the problem of efficient storage and retrieval of ECG data deserves attention. If each ECG signal requires N words to store it, then an m:1 data compression implies that, on the average, N/m words per ECG are required for its storage. Such compression must be realized essentially without loss of the features necessary for a physician to interpret the ECG. Data compression of ECG's has significant implications. It could lead to the economical storage and retrieval of ECG's in data banks, and hence enable an institution to have readily available ECG records from large numbers of patients.
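The word count implied by an m:1 compression can be illustrated with a few lines of Python. This is only a sketch: the function name `words_after_compression` is hypothetical, rounding N/m up to a whole number of components is an assumption, and K = 1000 records is an arbitrary illustrative value. N = 128 and m = 3 are the values used in the experimental section.

```python
import math

def words_after_compression(n_words: int, m: float, num_ecgs: int = 1) -> int:
    """Words needed when each ECG keeps ceil(n_words / m) transform components.

    Rounding up is an assumption made for this sketch.
    """
    per_ecg = math.ceil(n_words / m)
    return per_ecg * num_ecgs

# N and m follow the paper's experiment; K is a hypothetical record count.
N, m, K = 128, 3, 1000
print("uncompressed words:", N * K)
print("compressed words:  ", words_after_compression(N, m, K))
```

With these values each ECG keeps 43 components, which matches the count retained in the 3:1 experiment reported later.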

II. SIGNAL REPRESENTATION CONSIDERATIONS

If {X} denotes the set of vectors obtained by sampling a class of ECG's, then an element of {X}, denoted by X, can be represented as

$$X = \sum_{i=1}^{N} y_i \phi_i = \Phi Y \tag{1}$$

where $\Phi = [\phi_1 \; \phi_2 \; \cdots \; \phi_N]$ is an $(N \times N)$ transform matrix, $X^T = [x_1 \; x_2 \; \cdots \; x_N]$ is a $(1 \times N)$ data vector, whose corresponding transform vector is $Y^T = [y_1 \; y_2 \; \cdots \; y_N]$.

Manuscript received January 18, 1974; revised July 1, 1974, and January 27, 1975.
N. Ahmed is with the Departments of Electrical Engineering and Computer Science, Kansas State University, Manhattan, Kans. 66506.
P. J. Milne is with the Department of Defense, Fort George Meade, Md.
S. G. Harris is with the Department of Surgery and Medicine, Kansas State University, Manhattan, Kans. 66506.

Without loss of generality, we assume that {X} is such that its mean vector $\bar{X}$ is the null vector; that is, $\bar{X} = E\{X\} = 0$, where E denotes expectation. If the set of basis vectors $\{\phi_i\}$ is chosen to be orthonormal, then (1) yields¹

$$y_i = \phi_i^T X, \quad i = 1, 2, \ldots, N. \tag{2}$$

Now, if M of the N components of Y are retained and an estimate of X is desired, then the remaining (N - M) components of Y are discarded. The corresponding mean-square error introduced is given by [1]

$$\epsilon^2(M) = \sum_{i=M+1}^{N} \phi_i^T \Sigma_x \phi_i \tag{3}$$

where $\Sigma_x = E(XX^T)$ is the covariance matrix of {X}. It can be shown that the choice for $\Phi$ in (1) is optimum when the $\phi_i$ are the eigenvectors of the covariance matrix $\Sigma_x$; the corresponding mean-square error (m.s.e.) is given by [1]

$$\epsilon^2(M)_{\mathrm{opt}} = \sum_{i=M+1}^{N} \lambda_i \tag{4}$$

where the $\lambda_i$ are the eigenvalues of $\Sigma_x$. The representation in (1) in terms of the eigenvectors of the covariance matrix $\Sigma_x$ is the discrete version of the Karhunen-Loeve expansion. The corresponding orthogonal transform in (2) is called the Karhunen-Loeve transform (KLT).

III. VARIANCE CRITERION [2], [3]

Two important observations pertaining to the analysis in the previous section are as follows:

(i) The KLT is the optimum transform for random signal representation with respect to the mean-square error criterion.

(ii) Since (2) is of the form Y = AX, where A is an $(N \times N)$ matrix, the transform covariance matrix is given by

$$\Sigma_y = A \Sigma_x A^{-1} = A \Sigma_x A^T. \tag{5}$$

When A is comprised of the eigenvectors of $\Sigma_x$, then

$$\Sigma_y = \operatorname{diag}[\lambda_1, \lambda_2, \ldots, \lambda_N] \tag{6}$$

which implies that the KLT components $y_i$ in (2) are totally uncorrelated.

¹ It is assumed that $\{\phi_i\}$ consists of real-valued vectors; otherwise $\phi_i$ in (2) is replaced by its complex conjugate.
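As a small numerical check of (4) through (6), the sketch below uses a hypothetical 2 x 2 covariance matrix whose eigenvectors can be written by hand. The matrix values and the helper names `matmul` and `transpose` are assumptions made for illustration; they do not come from the ECG data.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Hypothetical covariance matrix of zero-mean data (not from the paper).
sigma_x = [[2.0, 1.0],
           [1.0, 2.0]]

# Rows of A are the orthonormal eigenvectors of sigma_x:
# (1, 1)/sqrt(2) with eigenvalue 3 and (1, -1)/sqrt(2) with eigenvalue 1.
s = 1.0 / math.sqrt(2.0)
A = [[s, s],
     [s, -s]]

# Equation (5): covariance matrix in the transform domain.
sigma_y = matmul(matmul(A, sigma_x), transpose(A))

# Equation (6): sigma_y is diagonal, with the eigenvalues on the diagonal.
print(sigma_y)

# Equation (4) with M = 1: discarding the second component costs lambda_2.
mse_opt = sigma_y[1][1]
print(mse_opt)
```

Up to floating-point rounding, the off-diagonal entries vanish and the error from discarding the second component equals the smaller eigenvalue.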


Since the eigenvalues are the main diagonal terms of $\Sigma_y$, they correspond to the variances of the transform components $y_i$, $i = 1, 2, \ldots, N$. However, for an orthogonal transform A which is not the KLT, the transform covariance matrix $\Sigma_y = A \Sigma_x A^T$ has non-zero off-diagonal terms, which implies that only partial decorrelation of the data is realized. The motivation for considering such suboptimal transforms is that the computational problems associated with the KLT cause it to be impractical for large values of N.

Now, from (4) it follows that the effectiveness of a KLT component for representing X is determined by the corresponding eigenvalue. If a component $y_i$ is deleted, then the mean-square error increases by $\lambda_i$. Therefore, the component corresponding to the smallest eigenvalue should be deleted first, and so on. Conversely, if $y_1, y_2, \ldots, y_M$ are the KLT components with the largest variances, then they are retained while the rest are discarded. The exact m.s.e. corresponding to this choice of KLT components is given by (4). This process of component selection is referred to as the variance criterion, since the eigenvalues are also the variances of the KLT components. The variance criterion can also be applied to suboptimal transforms, as illustrated in what follows.

Let $\sigma_i^2$ denote the variance of a suboptimal transform component $y_i$, $i = 1, 2, \ldots, N$. We rearrange the $\sigma_i^2$ in decreasing order of magnitude and denote the resulting set by $\hat{\sigma}_i^2$, $i = 1, 2, \ldots, N$. That is,

$$\hat{\sigma}_1^2 \ge \hat{\sigma}_2^2 \ge \cdots \ge \hat{\sigma}_N^2.$$

Similarly, corresponding to the set $\hat{\sigma}_i^2$, $i = 1, 2, \ldots, N$, we obtain the set $\hat{y}_i$, $i = 1, 2, \ldots, N$. For example, if N = 8 and

$$\{\sigma_i^2\} = \{15.64,\ 1.02,\ 2.44,\ 2.57,\ 1.54,\ 0.55,\ 0.02,\ 0.1\}, \qquad Y^T = [y_1 \; y_2 \; y_3 \; y_4 \; y_5 \; y_6 \; y_7 \; y_8] \tag{7}$$

then

$$\{\hat{\sigma}_i^2\} = \{15.64,\ 2.57,\ 2.44,\ 1.54,\ 1.02,\ 0.55,\ 0.1,\ 0.02\}, \qquad \hat{Y}^T = [\hat{y}_1 \; \hat{y}_2 \; \hat{y}_3 \; \hat{y}_4 \; \hat{y}_5 \; \hat{y}_6 \; \hat{y}_7 \; \hat{y}_8] \tag{8}$$

where $\hat{y}_1 = y_1$, $\hat{y}_2 = y_4$, $\hat{y}_3 = y_3$, $\hat{y}_4 = y_5$, $\hat{y}_5 = y_2$, $\hat{y}_6 = y_6$, $\hat{y}_7 = y_8$, and $\hat{y}_8 = y_7$.

Thus if a specific compression is desired, say 2:1, components $\hat{y}_1$, $\hat{y}_2$, $\hat{y}_3$, and $\hat{y}_4$ are retained, while $\hat{y}_5$, $\hat{y}_6$, $\hat{y}_7$, and $\hat{y}_8$ are discarded. In general an m:1 compression is secured by retaining $\hat{y}_i$, $i = 1, 2, \ldots, N/m$, and discarding the rest. The corresponding m.s.e. can only be estimated, since there is no closed-form expression for the exact m.s.e., as was the case with the KLT. This is because suboptimal transforms do not achieve total decorrelation of the data. Consequently the corresponding transform covariance matrix $\Sigma_y = A \Sigma_x A^T$ has off-diagonal terms. However, ignoring the off-diagonal terms [4], one can estimate the m.s.e. for an m:1 compression by computing

$$\epsilon^2(M)_{\mathrm{sub}} = \sum_{i=M+1}^{N} \hat{\sigma}_i^2 \tag{9}$$

where M = N/m. Equation (9) can be used effectively to compare the performances of suboptimal transforms, since the optimum as well as the exact m.s.e. is given by (4).

The information in (7) can also be expressed in terms of the variance distribution, which is obtained by plotting the set of normalized variances

$$\hat{\sigma}_i^2 \Big/ \sum_{j=1}^{N} \hat{\sigma}_j^2 \tag{10}$$

where $\sum_{j=1}^{N} \hat{\sigma}_j^2$ equals the trace of $\Sigma_y$. There are two reasons for normalizing by the trace of $\Sigma_y$: (i) it is invariant with respect to an orthonormal transformation, and (ii) it represents the variance energy of the signal.

IV. DATA ACQUISITION [5]

ECG waveforms were recorded using the standard limb lead system, which consists of three leads. The data were separated into two classes, namely normal and abnormal. The abnormals showed ventricular defects that were induced by both chemical and mechanical means. These ECG signals were then digitized using a sampling rate of 400 samples per second, which implies that the bandwidth of the ECG signal was considered to be less than 200 Hz.

Each digital ECG was represented by 128 samples. These samples were chosen such that the QRS complex and the T wave would always appear. Thus if 128 samples would not span the entire ECG, those in the P wave portion were neglected. The resulting digital data were displayed using a CALCOMP plotter. Three hundred ECG's were then chosen, of which half were normal and the other half were abnormal.

V. EXPERIMENTAL RESULTS [5]

For purposes of discussion, we consider the discrete cosine transform (DCT) and lead 1 ECG's. The overall data covariance matrix is computed as

$$\Sigma_x = \tfrac{1}{2}\left[\Sigma_{x_1} + \Sigma_{x_2}\right]$$

where $\Sigma_{x_1}$ and $\Sigma_{x_2}$ are the covariance matrices of the normal and abnormal classes of ECG's, respectively. The 150 normal and 150 abnormal ECG data are used to compute $\Sigma_x$. Subsequently the transform covariance matrix is computed as $\Sigma_y = A \Sigma_x A^T$. The variance distribution that results from $\Sigma_y$ is shown in Fig. 1. The area under each curve for a given number of transform components is an indication of the amount of variance energy contained in those components. The total area under each curve is unity as a consequence of the normalization by the trace. Thus it can be seen that the order of effectiveness for data compression is given by

KLT > DCT > IT

where IT abbreviates the "identity transform," which is defined as

$$Y = A_I X$$


Fig. 1. Variance distributions for the IT, DCT, and KLT. (Horizontal axis: components; vertical axis: normalized variance.)

where $A_I$ is the $(N \times N)$ identity matrix. Thus the IT leaves the data as is. It is important to note that the IT can be viewed as the simplest orthogonal transform.²

² A discussion of the Haar and discrete cosine transforms may be found in [2] and [6], respectively.

From Fig. 1 it is apparent that almost all of the signal energy (i.e., area under the curve) is packed into about 45 DCT and KLT components. In contrast, the energy is essentially spread over a large number of the IT components. Based on this observation, a 3:1 data compression is considered. Thus the 43 DCT components with the largest variances are selected to represent each ECG in the transform domain. The remaining 85 components are set equal to zero and the inverse DCT is then taken. CALCOMP plots of the ECG's so obtained are shown in Figs. 2 and 3 along with the original data, which use 128 points per ECG. The corresponding results obtained using the Haar transform (HT) are also included. Inspection of Figs. 2 and 3 shows that the information lost as a consequence of the 3:1 data compression is not diagnostically significant. We remark that a premature beat could be lost in such reconstructed data.

Fig. 2. Original and reconstructed normal ECG's; 3:1 data compression. (Panels: original, 128 points/ECG; DCT, 43 points/ECG; HT, 43 points/ECG.)

Variation of m.s.e.

The manner in which the m.s.e. varies for each of the transforms is summarized in Table I. In each case the m.s.e. is normalized by computing

$$\epsilon^2(M)_{\mathrm{opt}} / \operatorname{tr}(\Sigma_y) \quad \text{and} \quad \epsilon^2(M)_{\mathrm{sub}} / \operatorname{tr}(\Sigma_y)$$

where $\epsilon^2(M)_{\mathrm{opt}}$ and $\epsilon^2(M)_{\mathrm{sub}}$ are defined in (4) and (9), respectively, and $\operatorname{tr}(\Sigma_y)$ denotes the trace of the transform covariance matrix $\Sigma_y$. With respect to Table I, we make the following observations:

(i) For increasing M, the errors for the KLT, DCT, and HT fall off more rapidly than the m.s.e. for the IT.

(ii) The m.s.e. estimate for the DCT comes closest to that of the KLT, which is the best one can attain since the KLT is optimum.

(iii) Values of M between 32 and 64 are reasonable candidates for attempting data compression via the DCT and HT.
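The variance criterion and the normalized m.s.e. estimate can be sketched in a few lines of Python, using the N = 8 variances of the example in (7) and (8). The function name `variance_criterion` and its return layout are hypothetical choices for this sketch. The estimate ignores off-diagonal covariance terms, as (9) does.

```python
def variance_criterion(variances, m):
    """Apply the variance criterion for an m:1 compression.

    Returns (kept, mse_sub, normalized), where kept holds the 0-based indices
    of the N/m components with the largest variances, mse_sub is the estimate
    of equation (9), and normalized is mse_sub divided by the sum of all
    variances (the trace of the transform covariance matrix).
    """
    n = len(variances)
    keep = n // m
    # Component indices ordered by decreasing variance.
    order = sorted(range(n), key=lambda i: variances[i], reverse=True)
    kept, dropped = order[:keep], order[keep:]
    mse_sub = sum(variances[i] for i in dropped)
    return kept, mse_sub, mse_sub / sum(variances)

# Variances of the N = 8 example in (7).
variances = [15.64, 1.02, 2.44, 2.57, 1.54, 0.55, 0.02, 0.1]
kept, mse_sub, normalized = variance_criterion(variances, m=2)
print([i + 1 for i in kept])   # 1-based indices: [1, 4, 3, 5]
print(mse_sub, normalized)
```

For the 2:1 case the retained components are y1, y4, y3, and y5, in agreement with the ordering listed after (8).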

Fig. 3. Original and reconstructed abnormal ECG's; 3:1 data compression. (Panels: original, 128 points/ECG; DCT, 43 points/ECG; HT, 43 points/ECG.)

TABLE I
NORMALIZED M.S.E. FOR VARIOUS TRANSFORMS

No. of components              Normalized m.s.e.
retained, M          KLT       DCT       HT        IT
      8             0.2332    0.2838    0.3404    0.8767
     16             0.0773    0.1451    0.1763    0.7813
     32             0.0034    0.0293    0.0603    0.6180
     64             0.0000    0.0069    0.0088    0.3599
     96             0.0000    0.0000    0.0000    0.1608

Thus, since N = 128, compression ratios from 2:1 to 4:1 may be considered.

Storage Requirements

Fig. 4 shows some of the steps involved with respect to an m:1 data compression. If each sampled value of an ECG is stored using one word in storage, then the storage requirement is NK, where K is the total number of ECG's and N is the number of sampled values per ECG.
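The storage and retrieval steps of Fig. 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing used in the study: the DCT is taken to be the orthonormal DCT-II, the signals are synthetic Gaussian bumps standing in for ECG's, N = 16 and m = 2 are chosen for brevity, and all function names are hypothetical.

```python
import math

def dct_matrix(n):
    """Rows are orthonormal DCT-II basis vectors (sampled cosines)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows

def forward(a, x):
    """Y = A X."""
    return [sum(akj * xj for akj, xj in zip(row, x)) for row in a]

def inverse(a, y):
    """X = A^T Y, valid because A is orthonormal."""
    n = len(y)
    return [sum(a[k][j] * y[k] for k in range(n)) for j in range(n)]

def select_indices(a, signals, m):
    """Variance criterion: the N/m components with largest ensemble variance."""
    n = len(a)
    coeffs = [forward(a, x) for x in signals]
    # Mean-square value per component, i.e. the diagonal of A E(XX^T) A^T.
    var = [sum(c[k] ** 2 for c in coeffs) / len(coeffs) for k in range(n)]
    order = sorted(range(n), key=lambda k: var[k], reverse=True)
    return sorted(order[: n // m])

def store(a, x, idx):
    """Fig. 4(a): transform, then keep only the selected components."""
    y = forward(a, x)
    return [y[k] for k in idx]

def retrieve(a, stored, idx, n):
    """Fig. 4(b): zero the discarded components, then inverse transform."""
    y = [0.0] * n
    for k, v in zip(idx, stored):
        y[k] = v
    return inverse(a, y)

# Synthetic stand-in data (hypothetical, not ECG recordings).
N, m = 16, 2
signals = [[math.exp(-((j - c) / 2.0) ** 2) for j in range(N)] for c in (5, 7, 9)]
A = dct_matrix(N)
idx = select_indices(A, signals, m)
stored = store(A, signals[1], idx)
restored = retrieve(A, stored, idx, N)
err = sum((p - q) ** 2 for p, q in zip(signals[1], restored)) / N
print(len(stored), "words stored instead of", N, "; m.s.e. =", err)
```

Note that the retained indices are chosen once from the ensemble variances and then reused for every record, so only the N/m component values need to be stored per ECG.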

Using orthogonal transforms, we store N/m transform components per ECG. Each component is stored as a word in storage. Thus the total number of words required is NK/m, which implies that the storage requirements are reduced by a factor m.

Fig. 4. Pertaining to an m:1 data compression. (a) Storage. (b) Retrieval. (Block diagram; panel (a): sampled ECG, N-point transform, selection of transform components using the variance criterion, storage medium.)

VI. CONCLUSIONS

The experimental results presented in the last section demonstrate that it is feasible to use the variance criterion to secure data compression of canine ECG data. Since canine and human ECG's have similar characteristics, it is plausible that the method presented in this paper could also be used to study data compression of human ECG's.

ACKNOWLEDGMENT

The authors wish to express their indebtedness to Dr. W. W. Koepsel and Mr. T. Natarajan of the Department of Electrical Engineering, Kansas State University, for their assistance in processing the ECG data used in the preparation of this paper.

REFERENCES

[1] H. C. Andrews, Introduction to Mathematical Techniques in Pattern Recognition. New York: John Wiley & Sons, 1972.
[2] H. C. Andrews, "Multidimensional rotations in feature selection," IEEE Trans. on Computers, vol. C-20, pp. 1045-1051, Sept. 1971.
[3] N. Ahmed and K. R. Rao, Orthogonal Transforms for Digital Signal Processing. New York/Berlin/Heidelberg: Springer Verlag (in press).
[4] J. Pearl, "Basis restricted transformations and performance measures for spectral representation," IEEE Trans. Info. Theory, vol. IT-17, pp. 751-752, 1971.
[5] P. J. Milne, "Orthogonal transform processing of electrocardiographic data," Ph.D. dissertation, Kansas State University, Manhattan, Kansas, 1973.
[6] N. Ahmed et al., "Discrete cosine transform," IEEE Trans. on Computers, vol. C-23, pp. 90-93, Jan. 1974.
[7] C. A. Caceres and L. S. Dreifus, Clinical Electrocardiography and Computers. New York: Academic Press, 1970.
[8] R. C. Balda, "Computer assisted ECG interpretation," Measuring for Medicine, vol. 7, May-Aug. 1972, Hewlett-Packard, Waltham, Mass.

Statistically Constrained Inverse Electrocardiography

RICHARD O. MARTIN, MEMBER, IEEE, T. C. PILKINGTON, MEMBER, IEEE, AND MARY N. MORROW

Abstract-This paper examines the feasibility of utilizing statistical constraints on the inverse potential model to determine the potential distribution over a 4 cm sphere surrounding the heart from perturbed torso potentials. These perturbed torso potentials reflect instrumentation, quadrature, electrode placement, and heart position uncertainties. This work is an extension of the authors' previous work, which concluded that it is not feasible to determine this same potential distribution using unconstrained solutions. However, the results of the present work indicate that with the use of approximate signal and noise covariance matrices, it is possible to achieve estimates of this potential distribution with an average sum squared error of twenty-five percent. Further, the estimation of the signal and noise covariance matrices can be accomplished with a knowledge of heart geometry, torso geometry, the approximate measurement error, and a rough estimate of the time an average section of myocardium is depolarized, but without an a priori specification of the activation sequence.

Manuscript received May 26, 1972; revised November 11, 1974, and March 24, 1975. This paper was supported in part by USPHS Grants HL 05716, HL 05372, and HL 11307.
R. O. Martin is with the Department of Electrical Engineering, Christian Brothers College, Memphis, Tenn. 38104.
T. C. Pilkington and M. N. Morrow are with the Department of Biomedical Engineering, Duke University, Durham, N.C. 27706.

I. INTRODUCTION

ATTEMPTS at obtaining a physiologically meaningful solution to the inverse problem in electrocardiography have been numerous. Proposed models include dipoles [1]-[4], multipoles [5]-[9], and epicardial potentials [10]. This paper is an extension of the epicardial potential model examined in our previous paper [10]. The same field theory developments are utilized to relate the torso potentials to the potential at any point on a 4 cm sphere surrounding the heart. The geometrical heart and torso data and the activation data used in the simulations are the same dog data previously described [11], [12] and utilized in subsequent work [10]-[13]. This paper extends the previous work by injecting an assumed
