ECG data compression by modeling.

I. S. N. Murthy

Budagavi Madhukar

Biomedical Lab, Department of Electrical Engineering, Indian Institute of Science, Bangalore - 560 012, INDIA

ABSTRACT

This paper presents a novel algorithm for data compression of single lead electrocardiogram (ECG) data. The method is based on parametric modeling of the discrete cosine transformed ECG signal. Improved high frequency reconstruction is achieved by separately modeling the low and the high frequency regions of the transformed signal. Differential pulse code modulation is applied to the model parameters to obtain a further increase in compression. Compression ratios up to 1:40 were achieved without significant distortion.

INTRODUCTION

Data compression methods for electrocardiogram (ECG) signals play an important role in computer processing and analysis of the ECG. The major reason for seeking a higher compression ratio (CR) has been the desire to obtain higher density storage in medical databases and hospital information systems. Another area where the need for efficient ECG data compression has been felt is ambulatory ECG monitoring (AECGM). AECGM is usually done using the conventional Holter monitor, which consists of a 24 hour cassette recording of the ECG. Modern Holter monitors with digital IC memory cards are now expected to improve the fidelity of recording and make the system more compact. But due to the limited capacity of the IC memory cards, the sampled ECG data generated during the 24 hours must be compressed before it can be stored digitally. Another recently proposed application is compression of ECG data for storage (along with other patient data) on a medical smart card. These and many other applications demand data compression algorithms with very high CRs.

0195-4210/92/$5.00 © 1993 AMIA, Inc.

Existing ECG data compression techniques have been classified into (a) direct data handling and (b) transformation techniques. The direct data handling techniques achieve data compression by removing the redundancies present in the actual ECG signal samples. Techniques such as AZTEC[1], CORTES[2], SAPA[3], and DPCM and entropy coding[4] come under this category. In contrast, the transform domain techniques achieve data compression by constraining their basis functions. Many discrete orthogonal transforms such as the KLT[5], DCT[6], and FT[7] have been used for ECG data compression.

The algorithm presented in this paper comes under a third category, parameter extraction techniques. Here a unique set of parameters characterizing the input signal is found. Data compression is achieved in the process, as the input signal frame is represented by a smaller set of parameters.

In [8] an algorithm was presented for modeling and delineating the ECG. The algorithm consisted of modeling the discrete cosine transform of the ECG by the Steiglitz-McBride (SM) method[9]. A signal frame of length 400 samples was represented by a model with 20 parameters. Direct application of the algorithm resulted in a very high CR; however, it failed to model the clinically significant Q and S waves. This happens mainly because the low frequency region of the transform has a very high amplitude while the high frequency region has a comparatively lower amplitude. As the modeling emphasizes the high amplitude region more, the low amplitude region, which is crucial for the reconstruction of small significant components such as the Q and S waves and the QRS notches, gets neglected. This is a consequence of the global error criterion minimized by the SM algorithm. We overcome this problem by separately modeling the high amplitude low frequency (HALF) and the low amplitude high frequency (LAHF) regions of the transformed signal. Improved performance is achieved as the HALF and the LAHF regions are decoupled and modeled effectively.

ALGORITHM

The block diagram of the algorithm is shown in Figure 1. The algorithm consists of the following four steps: (a) Transformation, (b) Modeling, (c) Quantization, and (d) Reconstruction.

Figure 1: (a) Modeling and Quantization of the input ECG

Figure 1: (b) Reconstruction of the ECG from the quantized parameters

Transformation

The discrete cosine transform (DCT) of the given frame of the discrete time ECG signal s(n) of N samples duration is computed using

$$S(0) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} s(n) \tag{1}$$

$$S(k) = \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} s(n) \cos\frac{(2n+1)k\pi}{2N}, \quad k = 1, \ldots, N-1 \tag{2}$$

where S(k), k = 0, 1, ..., N-1 is the DCT of the input sequence s(n). Similarly the inverse DCT of S(k) can be computed to get back s(n).

Modeling

The transformed signal S(k) is split into two sequences, the high amplitude low frequency (HALF) sequence S_HALF(k) and the low amplitude high frequency (LAHF) sequence S_LAHF(k). These two sequences are then approximated as the impulse responses of the unknown models S_HALF(z) and S_LAHF(z). Let

$$S_{HALF}(z) = B_0 \, \frac{1 + B_1 z^{-1} + \cdots + B_p z^{-p}}{1 + A_1 z^{-1} + \cdots + A_p z^{-p}} \tag{3}$$

where B_0 is the gain and p is the order of the estimated HALF model, and

$$S_{LAHF}(z) = D_0 \, \frac{1 + D_1 z^{-1} + \cdots + D_q z^{-q}}{1 + C_1 z^{-1} + \cdots + C_q z^{-q}} \tag{4}$$

where D_0 is the gain and q is the order of the estimated LAHF model. The parameters B_0, B_1, ..., B_p, A_1, A_2, ..., A_p, D_0, D_1, ..., D_q, C_1, C_2, ..., C_q of the S_HALF(z) and S_LAHF(z) models are estimated using the iterative SM method. S_HALF(z) and S_LAHF(z) uniquely characterize the DCT of the ECG and consequently (by taking the inverse DCT) the input ECG itself.

Criterion for splitting the DCT. Let

$$\{S_{HALF}(k)\} = \{S(k)\}, \quad k = 0, \ldots, m$$
$$\{S_{LAHF}(k)\} = \{S(k)\}, \quad k = m+1, \ldots, N-1 \tag{5}$$

We empirically found that the following criterion provides satisfactory results: choose the smallest value of m for which

$$\frac{\max(|S_{LAHF}(k)|)}{\max(|S_{HALF}(k)|)} \le 0.1 \tag{6}$$

where max() is the maximum operator. Calculation of m is not critical as the DCT of the ECG does not change abruptly. Hence an error in estimation of m of up to ±20 samples can be tolerated.
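The transformation and splitting steps can be illustrated with a minimal sketch. It assumes an orthonormal DCT-II, evaluated directly in O(N^2) operations, and reads the splitting criterion as "ratio not exceeding 0.1". The function names `dct`, `idct` and `split_index`, and the Gaussian test pulse standing in for a QRS complex, are hypothetical and are not taken from the paper.

```python
import math


def dct(s):
    """Orthonormal DCT-II of a real sequence, evaluated directly."""
    N = len(s)
    S = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        acc = sum(s[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                  for n in range(N))
        S.append(scale * acc)
    return S


def idct(S):
    """Inverse of dct() above (orthonormal DCT-III)."""
    N = len(S)
    s = []
    for n in range(N):
        acc = math.sqrt(1.0 / N) * S[0]
        acc += sum(math.sqrt(2.0 / N) * S[k]
                   * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                   for k in range(1, N))
        s.append(acc)
    return s


def split_index(S, threshold=0.1):
    """Smallest m with max|S[m+1:]| <= threshold * max|S[:m+1]|."""
    mags = [abs(v) for v in S]
    N = len(mags)
    # suffix_max[k] holds max(mags[k:]); the extra entry is a sentinel.
    suffix_max = [0.0] * (N + 1)
    for k in range(N - 1, -1, -1):
        suffix_max[k] = max(mags[k], suffix_max[k + 1])
    prefix_max = 0.0
    for m in range(N - 1):  # leave at least one coefficient for LAHF
        prefix_max = max(prefix_max, mags[m])
        if suffix_max[m + 1] <= threshold * prefix_max:
            return m
    return N - 1


# Hypothetical test frame: a narrow Gaussian pulse, not real ECG data.
frame = [math.exp(-((n - 50) / 4.0) ** 2) for n in range(100)]
S = dct(frame)
m = split_index(S)
S_half, S_lahf = S[:m + 1], S[m + 1:]
print("split index m =", m, "| HALF length", len(S_half),
      "| LAHF length", len(S_lahf))
```

Because the suffix maximum can only shrink and the prefix maximum can only grow as m increases, the first m that satisfies the test is also the smallest, which matches the wording of the criterion.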


Quantization

The model parameters A_i, B_i, C_i, and D_i are very sensitive to quantization and round-off errors. Even small quantization errors in these parameters lead to very large changes in the impulse response and hence in the reconstructed ECG. To avoid these problems, the polynomials in Equation 3 and Equation 4 are factorized to obtain their roots. The roots of the numerator polynomials are the zeros, while those of the denominator polynomials are the poles. These poles and zeros are then quantized and stored in polar form as magnitude r and phase θ. In addition, the gain constants are also quantized and stored. The poles and zeros usually occur in complex conjugate pairs, so only half of them need be stored. In some cases two of the zeros were found to be real, so an extra bit of information is stored to distinguish the real zeros. We found that for faithful reconstruction of the ECG waveform, the numbers of bits required to store a zero, a pole, the HALF gain, and the LAHF gain were 24, 22, 14, and 7 bits respectively.
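A rough sketch of polar-form quantization of one root of a conjugate pair follows. The paper gives only the total number of bits per pole or zero, so the equal split between magnitude and phase (11 bits each for a 22 bit pole), the uniform quantizer, and the magnitude range `r_max` are assumptions made for illustration. The function names are hypothetical.

```python
import cmath
import math


def quantize_uniform(x, lo, hi, bits):
    """Map x to the nearest of 2**bits evenly spaced levels on [lo, hi]."""
    levels = (1 << bits) - 1
    x = min(max(x, lo), hi)  # clip to the representable range
    idx = round((x - lo) / (hi - lo) * levels)
    return idx, lo + idx * (hi - lo) / levels


def quantize_root(root, r_bits, theta_bits, r_max=1.0):
    """Quantize one member of a conjugate pair as (magnitude, phase).

    Only the member with non-negative phase is stored; its conjugate
    is implied. r_max = 1.0 assumes the root lies inside the unit circle.
    """
    r, theta = cmath.polar(root)
    theta = abs(theta)
    r_idx, r_q = quantize_uniform(r, 0.0, r_max, r_bits)
    t_idx, t_q = quantize_uniform(theta, 0.0, math.pi, theta_bits)
    return (r_idx, t_idx), cmath.rect(r_q, t_q)


def poly_from_conjugate_pair(root):
    """Real coefficients of (1 - root z^-1)(1 - conj(root) z^-1)."""
    return [1.0, -2.0 * root.real, abs(root) ** 2]


# Hypothetical pole, chosen only to exercise the functions.
pole = cmath.rect(0.95, 0.3)
codes, pole_q = quantize_root(pole, r_bits=11, theta_bits=11)
print("stored codes:", codes)
print("second-order section:", poly_from_conjugate_pair(pole_q))
```

Storing magnitude and phase bounds the displacement of each root directly, whereas rounding the polynomial coefficients can move the roots by a much larger amount.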

Reconstruction

The block diagram of the reconstruction process is given in Figure 1b. The impulse response of the reconstructed LAHF model is appended to that of the reconstructed HALF model. The IDCT of this appended sequence gives us the reconstructed ECG.
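The reconstruction step might be sketched as below. The sketch assumes that each model is available as a gain plus numerator and denominator coefficient lists (leading coefficient 1), that the HALF response supplies DCT coefficients 0 to m and the LAHF response the remaining N-m-1, and that the inverse transform is the orthonormal one. The model values in the example are hypothetical.

```python
import math


def impulse_response(gain, num, den, length):
    """First `length` samples of h(n) for H(z) = gain * num / den.

    num and den are coefficient lists in powers of z^-1, den[0] == 1.
    """
    h = []
    for n in range(length):
        # A unit impulse at the input picks out the n-th numerator term.
        acc = gain * (num[n] if n < len(num) else 0.0)
        for i in range(1, len(den)):
            if n - i >= 0:
                acc -= den[i] * h[n - i]
        h.append(acc)
    return h


def idct(S):
    """Orthonormal inverse DCT (DCT-III), evaluated directly."""
    N = len(S)
    s = []
    for n in range(N):
        acc = math.sqrt(1.0 / N) * S[0]
        acc += sum(math.sqrt(2.0 / N) * S[k]
                   * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                   for k in range(1, N))
        s.append(acc)
    return s


def reconstruct_frame(half_model, lahf_model, m, N):
    """Append the LAHF response to the HALF response, then invert the DCT."""
    s_half = impulse_response(*half_model, m + 1)
    s_lahf = impulse_response(*lahf_model, N - m - 1)
    return idct(s_half + s_lahf)


# Hypothetical first-order models: (gain, numerator, denominator).
half = (2.0, [1.0, 0.5], [1.0, -0.9])
lahf = (0.1, [1.0], [1.0, -0.5])
frame_hat = reconstruct_frame(half, lahf, m=5, N=16)
print(len(frame_hat), "reconstructed samples")
```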

FURTHER INCREASE IN CR BY DPCM OF THE MODEL PARAMETERS

A further increase in the CR is achieved by exploiting the beat to beat similarity in continuously recorded ECG, such as that obtained in AECGM. By choosing beat synchronous frames, the variability in the locations of the model poles and zeros of adjacent frames is reduced. Therefore we can apply the principle of Differential Pulse Code Modulation (DPCM) to the model parameters to enhance the CR, as in [10]. In DPCM, the transmitted parameters corresponding to the signal frame of interest are expressed as a linear combination of the parameters in previous frames. We applied single frame DPCM with encouraging results. The number of bits used for coding the poles reduces to 9 bits from the initially required 22 bits, and to 10 bits from the initially required 24 bits for the zeros. Similarly, the two gains can be coded with 2 bits each, instead of the initially required 14 and 7 bits. An additional 9 bits per frame are required to store the frame size. However, we found that the number of bits required to code the parameters after the application of DPCM depends on the baseline shift present in the signal and on the heart rate variability.
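Single frame DPCM of the parameters can be sketched as follows. The sketch assumes the simplest predictor, in which each parameter is predicted by its reconstructed value in the previous frame (a linear combination with a single coefficient of 1), and a uniform quantizer with step size `step`. The function names and the parameter values are hypothetical.

```python
def dpcm_encode(frames, step):
    """Code each parameter as a quantized difference from the previous frame.

    frames is a list of equal-length parameter vectors, one per beat.
    The prediction for the first frame is zero, so its codes are large;
    a practical coder would send the first frame at full precision.
    """
    prev = [0.0] * len(frames[0])
    codes = []
    for params in frames:
        frame_codes = [round((p - q) / step) for p, q in zip(params, prev)]
        # Track what the decoder will reconstruct, so that quantization
        # errors do not accumulate from frame to frame.
        prev = [q + c * step for q, c in zip(prev, frame_codes)]
        codes.append(frame_codes)
    return codes


def dpcm_decode(codes, step):
    """Rebuild the parameter vectors from the integer difference codes."""
    prev = [0.0] * len(codes[0])
    frames = []
    for frame_codes in codes:
        prev = [q + c * step for q, c in zip(prev, frame_codes)]
        frames.append(prev)
    return frames


# Hypothetical (magnitude, phase) of one pole over three consecutive beats.
beats = [[0.950, 0.300], [0.952, 0.298], [0.949, 0.303]]
codes = dpcm_encode(beats, step=0.001)
decoded = dpcm_decode(codes, step=0.001)
print("difference codes:", codes)
```

After the first frame the codes stay small as long as the parameters drift slowly from beat to beat, which is why fewer bits suffice. Larger beat to beat changes, for example from baseline shift, widen the range the codes must cover.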

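CRITERIA USED FOR THE PERFORMANCE EVALUATION OF THE ALGORITHM

To evaluate the performance of our algorithm we adopted, besides visual comparison, the popularly used quantitative measure, the normalized root mean squared error (NRMSE), also known as the percent rms difference (PRD)[2], defined as:

$$\%\mathrm{NRMSE} = \sqrt{\frac{\sum_{n=0}^{N-1} \left(s(n) - \hat{s}(n)\right)^2}{\sum_{n=0}^{N-1} s(n)^2}} \times 100\%$$

where s(n) is the original ECG and ŝ(n) is the reconstructed ECG.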
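For reference, the NRMSE (PRD) measure can be computed with a few lines of code. This is a minimal sketch of the standard definition (error energy over original signal energy, as a percentage). The function name `prd_percent` is hypothetical.

```python
import math


def prd_percent(original, reconstructed):
    """Percent rms difference between a signal and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    err_energy = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    sig_energy = sum(a * a for a in original)
    if sig_energy == 0:
        raise ValueError("original signal has zero energy")
    return 100.0 * math.sqrt(err_energy / sig_energy)


# Hypothetical samples: every reconstructed value is off by 0.1.
print(prd_percent([1.0, 1.0, 1.0, 1.0], [1.1, 0.9, 1.1, 0.9]))
```

Because the denominator is the energy of the raw samples, a baseline offset in the original signal lowers the reported value, which is worth keeping in mind when comparing figures across recordings.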
