Abstract
The advent of the big data era creates both opportunities and challenges for traditional Chinese medicine (TCM). This study describes the origin, concept, connotation, and value of the scientific computation of TCM. It also discusses the integration of science, technology, and medicine under the guidance of the real-world clinical research paradigm. TCM clinical diagnosis, treatment, and knowledge were traditionally confined to the literature and to sensory impressions; however, several primary methods are now used to convert them into quantitative data, such as feature subset optimization, multi-label learning, and complex networks, which draw on complexity, intelligence, data, and computing sciences. These methods are also applied to model and analyze the various complex relationships in individualized clinical diagnosis and treatment and to support the related decision-making. They thus strongly support the real-world clinical research paradigm of TCM.
Keywords
big data; real world; clinical research; Chinese medicine; medical computing
Introduction
Researchers in data science have recently emphasized the value of data and the significance of data mining. In particular, Nature and Science have published special issues on big data [1,2]. These special issues demonstrate that big data is relevant to all scientific fields and that data processing must be approached from a new and higher perspective, ushering in the era of big data. Big data is likewise in demand in traditional Chinese medicine (TCM). Specifically, it is required by the clinical research-sharing systems established at the TCM clinical research base in China, which was organized by the China Academy of Chinese Medical Sciences. Such data are also necessary for the digitization of hospital information systems and for Internet advancement. Related theory and technical support should be provided to develop the national data center of TCM. The real-world clinical research paradigm of TCM, published in the Journal of Traditional Chinese Medicine (Zhong Yi Za Zhi), has attracted much attention because it responds to current dilemmas in clinical research and anticipates the development trends of TCM and of medical science as a whole.
The paradigm of real-world clinical scientific research is “data-oriented” and “combines medical practice and scientific computation.” Data computing is therefore necessary in TCM. Big data collection, management, analysis, and presentation have been applied successfully in TCM, drawing on techniques and methods from complexity, intelligence, data, and computing sciences. This study therefore systematically illustrates the concept, connotation, value, research fields, and calculation methods of scientific computing in TCM, based on these methods and guided by the real-world clinical research paradigm.
Discipline of scientific computing in TCM
The real-world clinical research paradigm of TCM is human-centered, data-oriented, and problem-driven. It alternates medical practice with scientific computation and integrates clinical practice with scientific research. This paradigm is derived from the typical model of TCM research and combines concepts, theories, and techniques from modern clinical epidemiology, evidence-based medicine, statistics, and complexity, intelligence, data, and information sciences.

A “data-oriented” perspective is both the key technology and the premise of the real-world clinical research paradigm. Therefore, all kinds of real-world clinical diagnosis and treatment information must be collected comprehensively and converted into data. These data span ancient and modern literature, future clinical records, experimental biology (for example genomics, proteomics, and metabolomics), and Internet-derived information on human health during daily living. The aggregation and consolidation of these data open up a new prospect for TCM clinical research supported by big data. The “data-oriented” perspective is an inevitable transition for TCM clinical research, a key technology for organically combining western medicine and TCM so that their advantages complement each other, and the premise of the real-world clinical research paradigm of TCM.

The main form of this paradigm is the “alternation of medical practice with scientific computation,” a contemporary version of the essential “clinic-to-clinic” approach. Scientific computation is a useful tool in the big data era. To a certain extent, it can take over from humans the tasks of remembering and analyzing clinical practice data, and it can identify laws and knowledge accurately and comprehensively. However, computation results must be verified in clinical practice, which is why computation should alternate with medical practice. Together, the “data-oriented” premise and the “alternation of medical practice with scientific computation” highlight the demand for data, information, intelligence, complexity, and computing sciences in real-world clinical research on TCM. To address problems of data acquisition, analysis, management, and validation, the disciplines involved must be tied to the characteristics of TCM.
Rather than applying these techniques simply and separately, an interdisciplinary subject is needed, for two reasons: first, it requires a clear understanding of the characteristics of TCM data, of TCM theory, and of the associated technical problems; and second, it requires in-depth research in data, information, intelligence, and complexity science and technology. Applicable theory and technology can then be developed in response to the data analysis demands of real-world clinical research on TCM. The scientific computation of TCM caters to these requirements and can proceed in two ways. In the first, the methods of complexity, intelligence, data, and information sciences are applied in clinical research. In the second, TCM concepts, theory, and knowledge are incorporated into the related computing subjects, so that scientific computing is integrated with specific TCM techniques, which in turn raises the efficiency and level of clinical research. The first approach is limited to the technological aspect but is developing steadily, whereas the second is nascent and interdisciplinary. Scientific computation underpins data mining analysis, beginning with the collection and structured entry of TCM symptoms according to the “disease-symptom-syndrome-prescription-effect” framework. These symptoms are investigated through image processing and pattern recognition technology to acquire information, such as the
Scientiﬁc computation of big data in real-world clinical research
color of the tongue and the texture, color, and luster of the face [10,11]. Signal processing, voice analysis, and pattern recognition technology are used to capture signs related to auscultation and smell, such as voice, cough, breathing, and body odor [12,13]. Machine learning techniques are used to optimize the inquiry scale. Vibration signal analysis and pattern recognition techniques are applied to obtain pulse rate, rhythm, force, pattern, and type. In data mining, feature selection and classification modeling can reduce the large number of recorded clinical symptoms to an optimal subset, after which the transition from symptoms to syndrome can be modeled and simulated. Complex networks and association rules can be applied to mine core prescriptions and the medicines added to or removed from them. Furthermore, drug effects can be used to detect significant interactions in TCM patient prescriptions. These processes apply current intelligent data processing technology to analyze Chinese medicine information. Information technology studies that incorporate TCM concepts and data characteristics remain scarce. Examples include the Bayesian Yin-Yang Intelligence System, which applies Yin-Yang and Five Elements theory to machine learning, and multi-label learning, which has been proposed for the diagnosis of mixed TCM syndromes because it is highly accurate. The scientific computation of TCM emphasizes computation within the real-world clinical research paradigm of TCM and is related to TCM informatics and TCM engineering in clinical research. TCM informatics is an emerging discipline that grew out of studies on Chinese medicine and information science and is based on the movement laws of dynamic phenomena. It applies holistic and dynamic criteria and uses computer and network technology. It examines information phenomena and information laws in the TCM field in order to represent, manage, analyze, simulate, and disseminate TCM information.
In the process, data on essential internal relations are determined, converted, and shared . The scientiﬁc computation of TCM overlaps with TCM in the following ways. First, data are conceptually broader than information as computing centers. Therefore, TCM computation focuses on clinical data. Second, TCM is directed by the theories of complexity, intelligence, data, and computing sciences. It concentrates on the collection, analysis, and mining of clinical data to detect and establish rules, as well as a system of individualized clinical diagnosis and treatment. TCM engineering employs the theories, methods, and techniques of modern natural and engineering sciences synthetically under the guidance of TCM theory. In theoretical systems, experimental research, clinical care, education, scientiﬁc research, and production and management decision-making, the study of TCM engineering is exhaustive in all of the following aspects: interdisciplinary, multi-method, multi-approach, multi-tool, and multi-perspective (macro and micro). The promotion of TCM moderniza-
Guozheng Li et al.
tion, industrialization, and internationalization can address all kinds of problems in the construction of versatile technology platforms, including those related to theories, techniques, and practices . Furthermore, it contributes to life science and human health. TCM engineering focuses more on the engineering perspective and the scientiﬁc computation of TCM analyses than traditional computation science. It also mines the theories and techniques of TCM clinical diagnosis and treatment systems in the view of complexity and data sciences. Much data has been generated in various scientiﬁc ﬁelds in response to the urgent demand for scientiﬁc computation in the big data age. Systems and data characteristics in biological and social studies and biological and social computation were established successively ; therefore, the theories and techniques of data science are integrated into the analysis of biological and social data. The mechanism of biological and social systems improves the efﬁciency and the results of the computations. The scientiﬁc computation of TCM thus thrives by applying commonly advantageous technology as a result of the advancement of these disciplines. In the process, clinical research on TCM develops further.
Theoretical framework of the scientific computation of TCM
The scientific computation of TCM involves humans, Chinese medicine, and related medical knowledge, and in this respect resembles biological, social, and other complex systems. Consequently, given comparatively limited resources, the behavior of the TCM system as a whole essentially cannot be determined through the independent analysis of its individual parts. Thus, this behavior should be predicted over a wide range of time or space. As a result, the scientific computation of TCM must be examined from the standpoint of holism rather than reductionism. The main characteristics of TCM computation are comprehensive big data samples and correlations. Notably: (1) a holistic view of TCM is important; and (2) a single, unique optimal solution generally does not exist. Therefore, we should accept any effective solution. Overall, we aim to obtain effective scientific computation solutions for TCM using digital human body systems, objective collection, expert systems, knowledge engineering, parallel systems, complex system science, and data mining, guided by the principles of “clinic-to-clinic” and “continuous exploration and improvement” and with reference to social computing theory and technology. These aims are illustrated in the theoretical framework of the scientific computation of TCM in Fig. 1.

Expert and parallel systems and knowledge engineering
The TCM system is extremely complex. Moreover, the proposed real-world clinical research paradigm is “human-centered” and involves patients and doctors, who are both independent and interactive. Hence, system behavior cannot be described effectively by conventional methods and models. In this respect, the relationship between doctor and patient is modeled by digital human body modeling, objective collection, expert systems, and knowledge engineering, both individually and in combination. The digital human body system determines substance composition and generates mechanical, mathematical, and information models of the human body. The objective collection system simulates the acquisition of symptoms by the four TCM diagnostic methods using image, sound, smell, and pressure sensors. On this basis, knowledge rules, reasoning, and artificial neural networks can be used to build expert-assisted diagnosis and treatment systems for medical diseases, chronic hepatitis, and sub-health and to establish TCM knowledge engineering systems. The parallel TCM system consists of an expert system, knowledge engineering, and a practical system. Clinical scientific research can be managed through the collaboration and comparison of artificial and practical diagnosis and treatment behavior in the parallel system.
The parallel system of TCM computation fundamentally compares and analyzes practical and artiﬁcial systems based on their connection to provide a “reference” and to “predict” their future status.
Fig. 1 Theoretical framework of the scientiﬁc computation of TCM.
Accordingly, this parallel system adjusts the control approach to obtain effective solutions to complex problems or to implement learning and training objectives.

Analysis and mining of big data
At present, most scientific computation in TCM relies on passive observation and statistics, because research objects can be difficult to experiment on actively and repeatedly. Tests often yield results and conclusions that generalize poorly because current randomized controlled trials are subject to uncontrollable and unobservable factors. Thus, analytic reasoning methods alone cannot solve TCM computation problems, and data mining, machine learning, and pattern recognition become important methods for analyzing and mining big data. Acquiring rules and knowledge through expert systems, knowledge engineering, and traditional expert inquiry is often infeasible, as mentioned previously. Therefore, the study of feature selection, classification, clustering, rule extraction, and complex network technology is particularly significant for clinical tasks such as core optimization and syndrome-effect analysis.

TCM computation is a discipline directed by the real-world TCM clinical research paradigm and based on complexity, intelligence, data, and computing sciences. Within the theoretical framework described above, its methods draw on the respective advantages of various subjects:
(1) Mathematical statistics is the most fundamental computation method and was vital in previous TCM studies. The bionic optimization method, which is based on biological transport mechanisms, can accomplish tasks such as symptom, core, and path optimization.
(2) Data mining techniques effectively derive underlying knowledge from big data. The feature extraction and selection method extracts the essential features from these data, can uncover the knowledge and rules behind them, and can be used to explore core symptoms. The classification modeling method is used not only to simulate clinical concepts but also to capture the diagnostic experience and knowledge of doctors. The association rules method can be used in mining and in diagnosis to determine the relationships among symptoms, syndromes, and prescription drugs. The disease-symptom-syndrome-prescription-effect relationship is the most relevant.
(3) TCM is a complicated system; therefore, the complex network method can be used to study the complex disease-symptom-syndrome-prescription-effect relationship, which can be either direct or nonlinear. The method of system dynamics can simulate the onset of diseases to study the occurrence and development of different diseases.
(4) A system that simulates the diagnosis process can be constructed using expert systems and knowledge engineering methods to determine medical treatment mechanisms.
(5) The integration method combines the methods described above to individualize TCM diagnosis and treatment.
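To make the association rules method in item (2) concrete, the following minimal Python sketch computes support, confidence, and lift for rules linking symptoms and herbs. The records, the symptom and herb names, and the thresholds are hypothetical and are not taken from the studies cited here. The sketch enumerates only rules with one item on each side, a deliberate simplification of a full frequent-itemset search.

```python
from itertools import combinations

# Hypothetical toy records: each clinic visit lists observed symptoms (s_*)
# and prescribed herbs (h_*). These are illustrative, not real clinical data.
records = [
    {"s_fatigue", "s_pale_tongue", "h_astragalus", "h_ginseng"},
    {"s_fatigue", "s_pale_tongue", "h_astragalus"},
    {"s_fatigue", "s_insomnia", "h_ginseng"},
    {"s_insomnia", "h_jujube_seed"},
    {"s_fatigue", "s_pale_tongue", "h_astragalus", "h_jujube_seed"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= record for record in records) / len(records)


def rules(records, min_support=0.4, min_confidence=0.8):
    """Enumerate one-to-one rules lhs -> rhs that meet both thresholds."""
    items = sorted(set().union(*records))
    found = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s_both = support({lhs, rhs}, records)
            s_lhs = support({lhs}, records)
            if s_both < min_support or s_lhs == 0:
                continue
            confidence = s_both / s_lhs
            if confidence >= min_confidence:
                # Lift above 1 means lhs and rhs co-occur more often than
                # they would if they were independent.
                lift = confidence / support({rhs}, records)
                found.append((lhs, rhs, s_both, confidence, lift))
    return found


for lhs, rhs, s, c, l in rules(records):
    print(f"{lhs} -> {rhs}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

On real prescription data, the thresholds would need tuning, and a high-confidence rule is only a candidate that still has to be checked against clinical practice, in keeping with the “clinic-to-clinic” principle.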
Application fields of the scientific computation of TCM
The scientific computation of TCM brings scientific computation to the field of Chinese medicine, following biological and social computation. TCM is studied and applied by extracting essential information from complex and redundant data on the basis of Chinese medicine and complexity, intelligence, data, and computing sciences. Therefore, the interaction between internal and external human factors and the relationships among etiology, pathogenesis, disease location, and complex states must be explored from the perspective of treatment. Some of the aspects that require investigation are as follows.

Medical equipment for symptom determination
With the aid of wearable computing devices, information can be acquired along multiple dimensions, including time, location, environment, and physiological and motion signals. Big data on the human body can be collected through continuous, long-term monitoring of these multi-dimensional signals. This health-related information is valuable and has a promising market outlook. The four-diagnosis instruments in TCM correspond to the four basic diagnostic methods: inspection, auscultation, inquiry, and palpation [10,11].

Structured knowledge system for electronic health recording
Clinical data are vital firsthand information in TCM clinical research. Highly experienced TCM practitioners achieve remarkable therapeutic effects, and much of their empirical knowledge is embedded in clinical data [3,29].

Analysis of interactions among symptoms
The relationships between symptoms must be determined to explore the subset of core symptoms. Most existing machine-learning-based research in Chinese medicine does not consider the correlation between the symptoms described by the data and their medical connotations. However, much of the TCM data on symptoms and syndromes are clearly defined medically. Thus, the interactions between symptoms and syndromes must be studied along with the TCM concepts behind these associations [14,30].
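One simple way to start exploring relationships between symptoms is a co-occurrence network. The sketch below is an illustration under stated assumptions, not the method of the cited studies: the cases and symptom names are hypothetical, Jaccard similarity is chosen here as the edge weight, and the 0.5 threshold is arbitrary.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical cases: each set holds the symptoms recorded for one patient.
cases = [
    {"chest_pain", "palpitation", "fatigue"},
    {"chest_pain", "palpitation"},
    {"fatigue", "poor_appetite"},
    {"chest_pain", "palpitation", "shortness_of_breath"},
    {"fatigue", "poor_appetite", "palpitation"},
]


def jaccard_edges(cases, threshold=0.5):
    """Link two symptoms when the Jaccard similarity of the sets of cases
    in which they occur reaches the threshold."""
    occurs = defaultdict(set)
    for idx, case in enumerate(cases):
        for symptom in case:
            occurs[symptom].add(idx)
    edges = {}
    for a, b in combinations(sorted(occurs), 2):
        shared = len(occurs[a] & occurs[b])
        either = len(occurs[a] | occurs[b])
        score = shared / either
        if score >= threshold:
            edges[(a, b)] = score
    return edges


def weighted_degree(edges):
    """Sum of edge weights per symptom; a rough hint at 'core' symptoms."""
    degree = defaultdict(float)
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return dict(degree)


edges = jaccard_edges(cases)
for (a, b), weight in edges.items():
    print(f"{a} -- {b}: {weight:.2f}")
print(weighted_degree(edges))
```

A purely statistical network like this ignores the medical meaning of each symptom, which is exactly the limitation noted above; in practice the edges would need to be interpreted, and possibly constrained, by TCM domain knowledge.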
Analysis of the correlation between symptoms and syndromes (disease)
Correlations can be analyzed in diagnosis modeling. In practical TCM data mining, many clinical cases present one or more of several syndromes simultaneously. This task can therefore be regarded as a multi-label classification problem in machine learning. However, existing multi-label classification solutions ignore the problems of unbalanced data and label inconsistency [16,20].

Analysis of the core prescription and the incorporation and removal of patent medicine
Effective cures for diseases must be developed, and the incorporation and removal of medicines in these cures must be examined as well. In line with these objectives, valuable information must be extracted from both historical and recent literature.

Analysis of the relationship between prescription and efficacy
The match between prescriptions and efficacy assessments should be analyzed. The clinical efficacy of TCM has been confirmed, and the utility of different prescriptions is a hot topic in TCM research. Lu et al. examined nearly 6000 real cases from a pandemic and determined the effects of hospital prescriptions. TCM treated fever during the pandemic more effectively than western medicine. Moreover, combinations of TCM and western medicine were rarely curative.

Analysis of data obtained from biomedical instruments
The Mars500 study was a psychological and physiological isolation experiment conducted by Russia, the European Space Agency, and China in preparation for a manned spaceflight to Mars at an unspecified future date. The study yielded valuable psychological and medical data on the effects of the planned long-term mission in deep space. The experiment investigated the technical challenges, the work capability of the crew, and the management of long-distance spaceflight. The main concerns during a flight to Mars are health problems, isolation, and the hermetically closed and confined environment.
Li et al. described the regular pattern of syndromes using a statistical method and presented machine learning methods to mine the relationship between computerized symptoms and the syndromes differentiated by experts. Using feature selection, they identified 10 key factors that are essential to syndrome differentiation in TCM. The average precision of the multi-label classification model reached 80% in this study.
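For readers unfamiliar with the metric, the sketch below shows one common ranking-based definition of average precision for multi-label classification: for each case, the labels are ranked by predicted score, and for every true label we take the fraction of labels ranked at or above it that are also true. The text does not state which definition was used in the Mars500 analysis, so this is an assumption; the syndrome names and scores are hypothetical, and ties between scores are not handled specially.

```python
def average_precision(scores, relevant):
    """Ranking-based average precision for one case.

    scores:   dict mapping each candidate label to a predicted score
    relevant: non-empty set of labels that truly apply to the case
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    rank = {label: pos for pos, label in enumerate(ranked, start=1)}
    total = 0.0
    for label in relevant:
        # Count true labels ranked at or above this one (including itself).
        hits = sum(1 for other in relevant if rank[other] <= rank[label])
        total += hits / rank[label]
    return total / len(relevant)


def mean_average_precision(all_scores, all_relevant):
    """Mean over cases, skipping cases that have no true label."""
    pairs = [(s, r) for s, r in zip(all_scores, all_relevant) if r]
    return sum(average_precision(s, r) for s, r in pairs) / len(pairs)


# Hypothetical predictions for two cases over three syndrome labels.
all_scores = [
    {"qi_deficiency": 0.9, "blood_stasis": 0.2, "yin_deficiency": 0.6},
    {"qi_deficiency": 0.3, "blood_stasis": 0.8, "yin_deficiency": 0.5},
]
all_relevant = [
    {"qi_deficiency", "yin_deficiency"},  # both true labels ranked on top
    {"qi_deficiency"},                    # the only true label ranked last
]

print(mean_average_precision(all_scores, all_relevant))
```

In this toy example the first case scores 1.0 and the second 1/3, so the mean is 2/3. A value of 80% on real data would mean that, on average, true syndromes sit near the top of each case's ranked list.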
Application of bioinformatics technology in TCM
Lu et al. used metabolomics technology to determine the curative effect of the medicine etretinate, based on a comparison of healthy subjects and patients with psoriasis across treatment cycles. They noted that this technology can be used to analyze the effect of Chinese medicine.
Conclusions and perspectives
This study comprehensively illustrated the key technology of “the real-world clinical research paradigm of TCM” in consideration of big data characteristics. This paradigm is “data-oriented” and takes as its main form “the alternation of medical practice with scientific computation.” Furthermore, this study proposes research directions for the scientific computation of TCM. These directions strengthen research on TCM problems and promote cooperation between TCM clinical researchers and information experts; the main content of current research and the methods at different levels follow these directions. The scientific computation of TCM can be distinguished from other scientific computations by its characteristics. We therefore establish a new branch of scientific computation for TCM, called TCM computation or Chinese medical computation. The study of TCM computation already rests on a certain foundation; however, it must keep pace with the development of the real-world clinical research paradigm. Scientific computation is used to develop solutions and to advance in-depth studies of clinical research processes, so computation methods should be developed in line with the characteristics of Chinese medical data. Big data are accumulating at an unexpected pace, and in-depth research is a popular technique for exploiting them. Furthermore, TCM data analysis has special requirements; therefore, novel algorithms should be developed, or existing in-depth research algorithms should be revised, for application to TCM data. This research direction is expected to facilitate the implementation and advancement of the real-world clinical research paradigm, to enhance TCM, and to contribute to the development of the medical field.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 61273305) and the Fundamental Research Funds for the Central Universities.
Compliance with ethics guidelines
Guozheng Li, Xuewen Zuo, and Baoyan Liu declare that they have no conflict of interest. This article does not contain any studies with human or animal subjects performed by any of the authors.
References
1. Special Online Collection. Dealing with data. Science, Feb. 11, 2011. http://www.sciencemag.org/site/special/data/
2. Nature special issue. Big data: science in the petabyte era. Nature, September 4, 2008. http://www.nature.com/nature/journal/v455/n7209/edsumm/e080904-01.html
3. Liu BY, Zhou XZ, Li P, Wang YH, Wen TC, Guo YF, Zhang RS, Chen SB. A unified clinical and research information platform toward individualized medicine. China Digit Med (Zhongguo Shu Zi Yi Xue) 2007; 2(6): 20 (in Chinese)
4. Yu L, Lu Y, Tian YM, Zhu XL. Research on architecture and key technology of Internet of Things in hospital. Transducer Microsyst Technol (Chuan Gan Qi Yu Wei Xi Tong) 2012; 31(6): 76–78,86 (in Chinese)
5. Liu BY. A navigational chart for contemporary traditional Chinese medicine to be drawn based on big data. China News Tradit Chin Med (Zhongguo Zhong Yi Yao Bao). 2013-6-5. 3 (in Chinese)
6. Liu BY. Clinical research paradigm of Traditional Chinese medicine in real world. J Tradit Chin Med (Zhong Yi Za Zhi) 2013; 54(6): 451–455 (in Chinese)
7. Huang XR. Complexity science. Chongqing: Chongqing University, 2012 (in Chinese)
8. Shi Z. Intelligence science. Beijing: Tsinghua University Press, 2006 (in Chinese)
9. Li GZ, Zeng XQ. Research progress in Chinese clinical medical data analysis and mining. Int J Biomed Eng (Guo Ji Sheng Wu Yi Xue Gong Cheng Za Zhi) 2013; 36(2): 88–92 (in Chinese)
10. Li F, Zhao C, Xia Z, Wang Y, Zhou X, Li GZ. Computer-assisted lip diagnosis on traditional Chinese medicine using multi-class support vector machines. BMC Complement Altern Med 2012; 12(1): 127
11. Shi MJ, Li GZ, Li FF, Xu C. Computerized tongue image segmentation via the double geo-vector flow. Chin Med 2014; 9(1): 7
12. Gao YT. Wenzhen. Beijing: Chinese Ancient Resources, 2008 (in Chinese)
13. Guo D, Zhang D, Li N, Zhang L, Yang J. A novel breath analysis system based on electronic olfaction. IEEE Trans Biomed Eng 2010; 57(11): 2753–2763
14. Shao H, Li GZ. Symptom selection for multi-label data of inquiry diagnosis in traditional Chinese medicine. Sci China Inf Sci 2013; 56(5): 1–13
15. Huang X, Tang W, Li R, Huang H. Comparative study on synchronous test of three-position pulse and simple pulse test. Chin J Basic Med Tradit Chin Med (Zhongguo Zhong Yi Ji Chu Yi Xue Za Zhi) 2005; 11(3): 210–234 (in Chinese)
16. You M, Li GZ. Medical diagnosis by using machine learning techniques. In: Josiah Poon, Simon Poon. Data Analytics for Traditional Chinese Medicine Research. Springer, 2014: 39–80
17. Zhou XZ, Liu BY, Wang YH, Zhang RS, Yao NL, Cui M. Study of compound drugs based on complex network method. Chin J Inf Tradit Chin Med (Zhongguo Zhong Yi Yao Xin Xi Za Zhi) 2008; 15(11): 98–100 (in Chinese)
18. Poon SK, Poon J, McGrane M, Zhou X, Kwan P, Zhang R, Liu B, Gao J, Loy C, Chan K, Sze DM. A novel approach in discovering significant interactions from TCM patient prescription data. Int J Data Min Bioinform 2011; 5(4): 353–368
19. Xu L. On essential topics of BYY harmony learning: current status, challenging issues, and gene analysis applications. Front Electr Electron Eng 2012; 7(1): 147–196
20. Liu GP, Li GZ, Wang YL, Wang YQ. Modelling of inquiry diagnosis for coronary heart disease in traditional Chinese medicine by using multi-label learning. BMC Complement Altern Med 2010; 10(1): 37
21. Cui M, Yin A, Li H. The establishment of Chinese medicine information study. J Tradit Chin Med (Zhong Yi Za Zhi) 2008; 49(3): 267–278 (in Chinese)
22. Shi Z, Zhao S. New progress of study in engineering area of traditional Chinese medicine. World Sci Technol - Modernization Tradit Chin Med Mater Med (Shi Jie Ke Xue Ji Shu - Zhong Yi Yao Xian Dai Hua) 2005; 7(1): 127–132,89 (in Chinese)
23. Jin Y, Hu G, Wang K. Computational Biology: analysis and applications of biological sequence. Beijing: Science Press, 2010 (in Chinese)
24. Wang FY. Social computing: science and technology, humanities. Bull Chin Academy Sci 2005; 20(5): 370–376
25. Bi SW. Digital Human Body - Human Body Digital Science. Beijing: Science Press, 2004 (in Chinese)
26. Wang Q. Study on the current situation of four diagnostic methods of TCM. J Tradit Chin Med (Zhong Yi Za Zhi) 2000; 41(4): 242–245 (in Chinese)
27. Zhang DZ, Peng JN, Fan HX. Present research situation and prospect of Chinese medicine expert system. Appl Res Comput (Ji Suan Ji Ying Yong Yan Jiu) 2007; 24(12): 6–9 (in Chinese)
28. Sun Y. Research progress analysis of Chinese medical knowledge engineering. Chin J Inf Tradit Chin Med (Zhongguo Zhong Yi Yao Xin Xi Za Zhi) 2010; 17(12): 5–6 (in Chinese)
29. You M, Li GZ, Zeng XQ, Ge L, Bi L, Huang S, Yang JY, Yang MQ. A personalized traditional Chinese medicine system in the case of Cai’s Gynecology. Int J Funct Inform Personal Med 2008; 1(4): 419–438
30. Li GZ, Sun S, You M, Wang YL, Liu GP. Inquiry diagnosis of coronary heart disease in Chinese medicine based on symptom-syndrome interactions. Chin Med 2012; 7(1): 9
31. Huang SY, Fang SC, Liu H, Zhang R, Wang CY, Bi L, Zhang L, Xu L, Li GZ, Wang HL, Su LN. The technology to carry out the old doctor of traditional Chinese medicine academic experience inheritance of global design examples of the application of data mining. J Shanghai Tradit Chin Med (Shanghai Zhong Yi Yao) 2011; 45(9): 1–3 (in Chinese)
32. Lu C, Zhou H, Luo Y, Chen B, Qin X, Wen Z, Li J, Ou AH, Ouyang W, Li X, Huang T, Liang Z, Yan S, Li GZ. Comparison of the clinical outcomes of Chinese, western, and integrative medicine for the treatment of H1N1 influenza A in China. Evid Based Complement Alternat Med (in press)
33. Li Y, Li GZ, Gao JY, Zhang Z, Fan Q, Xu J, Bai G, Chen K, Shi H, Sun S, Liu Y, Shao F, Mi T, Jia X, Zhao S, Chen J, Liu J, Guo Y. Syndrome differentiation analysis on MARS500 data of traditional Chinese medicine. Evid Based Complement Alternat Med (in press)
34. Lu C, Deng J, Li L, Li GZ. Application of metabolomics on diagnosis and treatment of patients with psoriasis in traditional Chinese medicine. Biochim Biophys Acta - Proteins and Proteomics 2014; 1844(1), Part B: 280–288