Published in International Journal of Informatics, Technology & Computers
Publication Date: December, 2019
Amapwan Regina Ayitat & Blamah Nachamada Vachaku
Department of Computer Science, University of Jos
The emergence of Information and Communication Technology (ICT) has opened opportunities in healthcare delivery, as advances in technology have increased the demand for intelligent systems, especially in medical practice. The inherent complexity of medical practice makes traditional approaches to diagnosing diseases and predicting treatment outcomes inadequate, and arriving at an accurate diagnosis can be difficult even for an expert doctor. Recently, Artificial Intelligence (AI), especially machine learning techniques, has been applied in medical practice to complement the traditional approaches and offset their drawbacks. In this research, an intelligent model was developed for this purpose by integrating an Artificial Neural Network (ANN) with logical inference for the diagnosis of hepatitis B virus disease. The ANN is a multilayer perceptron feedforward neural network trained with the backpropagation algorithm. The sigmoid function was used as the activation function of the model, which maps values into the [0, 1] interval. The data were obtained from the UCI Machine Learning Repository and were normalized in an Excel spreadsheet before being fed into the model for training. Neuroph Studio was used for data analysis and for developing the ANN model. The system was designed and implemented in the Java programming language on the NetBeans IDE, with MySQL as the database server. An object-oriented methodology was adopted, using the Unified Modeling Language (UML) to depict the logical view of the system. The normalized dataset contains a total of 155 records. This was divided into two sets: a training set of 145 records used to build and train the model, and a testing set of 10 records used to test and evaluate the diagnostic and predictive ability of the model. The outputs obtained on both the training and testing sets were compared with the actual results and agreed closely with them. The results show that an ANN can effectively simulate the behavior of a medical expert and can therefore be employed in designing a computer-based diagnostic model for hepatitis B virus disease.
Keywords: Intelligent systems, Medical diagnosis, Artificial Intelligence, machine learning techniques, Artificial Neural Network, hepatitis B virus disease, back propagation algorithm, Sigmoid function.
1. Introduction
The Oxford Advanced Learner’s Dictionary defines diagnosis as the act of identifying the exact cause of an illness or a problem. In the medical field, diagnosis is defined as the analysis of the physiological or biochemical cause of a disease (Sana, Imran, Aizai, Jamil, & Syed, 2011). Traditionally, physicians learn how to diagnose during their formative years in training. When consulted, they deduce the nature of a disease or formulate a treatment on the basis of more or less specified observations and acquired knowledge, with their experience as a guide. Continuous training and recertification help physicians keep most of the relevant information in mind, but the limits of human memory and recall, coupled with the vast growth of available knowledge, mean that no physician can know everything. High-risk diseases require specialists for diagnosis and treatment, because not all physicians have adequate expertise or experience in handling them. Such specialists are few, so patients may wait longer than expected to reach them, which increases the risk the disease poses to the patient. The emergence of Information and Communication Technology (ICT) has helped to address these problems. Physicians can use a medical assistant, such as a Medical Expert System, for quality patient management, which aids earlier and simpler diagnosis and treatment of patients.
In this research, an intelligent model for the diagnosis of hepatitis B disease is proposed. The objectives of this research are: to identify the problem and understand artificial neural networks and their application in medical diagnosis, particularly of hepatitis B; to develop an intelligent model using ANN to serve as the expertise for diagnosing hepatitis B virus disease; to carefully select variables from the hepatitis B dataset to serve as input and output variables for training and testing the ANN model; and to incorporate the intelligent model into the developed system in order to properly manage both existing and new hepatitis B patients.
2. Literature Review
Hepatitis B Virus (HBV) disease is a major cause of morbidity and mortality worldwide (Aspinall, Hawkins, Fraser, Hutchinson & Goldberg, 2011). Hepatitis is an inflammation of the liver that can be caused by a pathogen such as HBV or by toxins such as alcohol; hepatitis B is the form caused by HBV. HBV is a small hepadnavirus, about 42 nm in diameter, with a partially double-stranded circular DNA genome (Thad, Dave, & Robert, 2010), and it belongs to the virus family Hepadnaviridae. The virus replicates in the hepatocytes (liver cells) and interferes with the functions of the liver. This activates the immune system to produce a specific reaction to combat and possibly eradicate the virus, and in the process the liver is damaged and becomes inflamed.
Transmission is via percutaneous or permucosal exposure to infected blood or other body fluids, which can remain infectious for a week or more outside the body. Routes of transmission include sexual intercourse, sharing of syringes and needles, mother-to-child transmission, blood transfusion, and organ transplantation (Mauss et al., 2016). Most infected persons look perfectly healthy and show no symptoms of the disease.
Symptoms during the incubation phase are nonspecific, which makes the disease difficult to detect. They include fatigue, poor appetite, nausea, vomiting, abdominal pain, low-grade fever, jaundice, and dark urine, while clinical signs include liver tenderness, liver enlargement, and spleen enlargement. The severity of the disease depends on factors such as the patient’s age at infection, immune status, the stage at which the disease was recognized, and liver biopsy results (WHO, 2002). A vaccine is available that prevents hepatitis B infection for life (Karlik, 2012). Drugs used for treatment include alpha-interferon, adefovir and lamivudine.
Artificial Intelligence (AI) is the study of how to emulate human intelligence in computer technology. The potential of AI in medicine has been expressed by a number of researchers. Djam and Kimbi (2011) summarized the potential of AI techniques in medicine as follows:
i. They produce new tools to support medical decision making, training and research.
ii. They integrate activities in medical, computer, cognitive and other sciences, thereby offering a content-rich discipline for next-generation healthcare.
iii. Expert systems improve the quality of medical services and reduce delays in treatment.
In light of this potential, many intelligent systems have been developed to enhance the healthcare sector. Raenu and Rafidah (2016) developed a web-based online medical diagnosis system called WOMEDS. The system enables patients to diagnose their own health problems and provides health monitoring tips for them to follow. Doctors can also use the system to carry out further diagnosis on a particular patient based on references obtained from the system’s database. The system’s reliance on the web is a major drawback, as not all patients have the resources to access the web.
Expert systems have already been applied in a number of areas of medicine. Early intelligent medical systems such as MYCIN and CASNET have been shown to outperform manual diagnosis in several disease domains (Djam, 2013). MYCIN was developed in the early 1970s to diagnose certain bacterial infections and recommend antimicrobial drug treatment. It has several facilities, including explanation, knowledge acquisition, teaching and system building facilities. Causal Associational Networks (CASNET) was developed in the 1970s as a general tool for building expert systems for the diagnosis and treatment of diseases. Its major application was the diagnosis of glaucoma and the recommendation of treatment for it.
An artificial neural network (ANN) is a computational model that attempts to account for the parallel nature of the human brain (Sumana, Anjan & Sendhil, 2013). ANNs represent a simplified mathematical model of the central nervous system and, like the brain, they can recognize patterns, manage data and, most importantly, learn.
Bascil and Oztekin (2012) carried out a comparative study on hepatitis disease diagnosis using a probabilistic neural network (PNN). A PNN is a network formulation of probability density estimation based on competitive learning with a winner-takes-all strategy, and its core concept rests on multivariate probability. The aim of the study was achieved, and the results were compared with related works using classification accuracy as the performance measure. However, the study only classified patients as either live or die, without specifying the severity of the disease in the patients.
Dakshata and Seema (2011) and Mahesh, Kannan and Saravanan (2014) each proposed an expert system using an artificial neural network for the diagnosis of hepatitis B. Both groups used a generalized regression neural network (GRNN), a type of neural network that uses kernel-based approximation to perform regression and is one of the Bayesian networks, incorporated in an expert system to diagnose the disease and, if positive, report its severity. Although both studies were successful, neither explicitly elaborated on its methodology, and GRNN is only used for regression problems.
Venkatesam and Penchalaiah (2015) proposed a genetic pattern recognition system for the diagnosis of hepatitis B and C. The aim of their study was to use an ANN and a genetic algorithm (a search technique used in computing to find exact or optimal solutions to optimization and search problems) to diagnose hepatitis B and C virus infection. The ANN models the data and is trained to learn causal patterns in the genetic dataset, while the genetic algorithm selects the generated patterns and analyzes them based on its genetic knowledge for prediction. They obtained an accuracy of 98% in predicting the diseases.
3. Data Analysis/ Findings
This section presents the proposed solution to the problem being addressed. The methodology adopted is object-oriented design combined with artificial neural network (ANN) modeling, which is used for the development of the system. The object-oriented technique allows for the creation of meta classes and the use of inheritance, for which other methodologies are inappropriate. ANN modeling can describe the complexities of medical systems because it is capable of mapping inputs to desired outputs as long as both are transformed into appropriate numerical forms. The strength of an expert system lies in the knowledge contained in the knowledge base used by its inference engine. Therefore, for knowledge extraction, the hepatitis dataset and the knowledge acquired during the research will be used to generate meaningful rules for the knowledge base to aid the diagnosis of hepatitis B virus (HBV) disease. ANN will be incorporated in the system to aid the diagnosis and to generate the disease severity, making the system an intelligent system.
3.1.1 Overall system description
This is an intelligent, standalone, knowledge-based hepatitis diagnostic system that incorporates an ANN for the diagnosis of HBV disease. The ANN model is built using Neuroph Studio, an ANN framework based on the Java programming language. The system diagnoses HBV disease in humans using blood markers and signs/symptoms, with its functionality implemented in object-oriented Java. The system works offline and can be run from the Windows command prompt. It communicates with the user, in this case the medical doctor, in plain language, and no special knowledge is required for the doctor to use it to diagnose and manage patients infected with HBV disease and their data. In addition, the diagnosis section allows further diagnosis of a patient depending on whether the disease status is negative or positive: if positive, the system generates the severity of the disease; if negative, the system terminates the diagnosis.
3.1.2 Method used: ANN
In this system, an artificial neural network model is used to classify the severity of the disease in patients infected with HBV. The developed ANN model is based on the multilayer perceptron network (MLPN) architecture, also known as the multilayer feedforward network, with one hidden layer in addition to its input and output layers. This is one of the most widely used network architectures. In this type of network, each neuron computes a biased weighted sum of its inputs and passes this activation level through an activation function to produce its output.
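The computation performed by each neuron can be illustrated with a minimal Java sketch. It is not the Neuroph implementation used in this work: the class `NeuronSketch` and its method names are hypothetical, and the sigmoid is used as the activation function because that is the function this paper reports using.

```java
// Illustrative sketch; the class and method names are hypothetical.
final class NeuronSketch {

    /** Logistic sigmoid: maps any real activation level into (0, 1). */
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** One neuron: bias plus weighted sum of inputs, passed through the sigmoid. */
    static double neuronOutput(double[] inputs, double[] weights, double bias) {
        if (inputs.length != weights.length) {
            throw new IllegalArgumentException("inputs and weights must have the same length");
        }
        double activation = bias;
        for (int i = 0; i < inputs.length; i++) {
            activation += weights[i] * inputs[i];
        }
        return sigmoid(activation);
    }

    /** One layer: row j of the weight matrix and biases[j] belong to neuron j. */
    static double[] layerOutput(double[] inputs, double[][] weights, double[] biases) {
        double[] outputs = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            outputs[j] = neuronOutput(inputs, weights[j], biases[j]);
        }
        return outputs;
    }
}
```

Calling `layerOutput` once for the hidden layer and once more for the output layer, feeding the first result into the second call, gives the forward pass of a network with one hidden layer.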
Important issues in MLPN design include the number of hidden layers and the number of neurons in these layers. Once the number of layers and the number of units in each layer have been selected, the network’s weights and activation functions must be set so as to minimize the prediction error made by the network. This is the role of the learning algorithm during training. A suitable learning algorithm shortens the training time while achieving better accuracy. The learning algorithm used in this research is the backpropagation (BP) algorithm, which searches the error surface for points of minimum error using gradient descent. Each BP iteration consists of two steps: a forward activation to produce a solution, and a backward propagation of the computed error to modify the weights. The activation function used with the BP algorithm is the sigmoid function, which transforms its input, which can take any value between minus and plus infinity, into a value in the range between 0 and 1. This can be represented by the equation below:

$$f(x) = \frac{1}{1 + e^{-x}}$$
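The two steps of a BP iteration can be sketched in plain Java for a network with one hidden layer and a single sigmoid output. This is a simplified illustration under stated assumptions, not the Neuroph code used in this work: the class `BackpropSketch`, the random initial weight range, the squared-error measure and the learning rate passed by the caller are all hypothetical choices.

```java
// Illustrative sketch; names, initial weight range and error measure are assumptions.
final class BackpropSketch {
    final double[][] hiddenWeights; // [hidden neuron][input], last column holds the bias
    final double[] outputWeights;   // [hidden neuron], last entry holds the bias

    BackpropSketch(int inputCount, int hiddenCount, long seed) {
        java.util.Random rng = new java.util.Random(seed);
        hiddenWeights = new double[hiddenCount][inputCount + 1];
        outputWeights = new double[hiddenCount + 1];
        for (double[] row : hiddenWeights) {
            for (int i = 0; i < row.length; i++) {
                row[i] = rng.nextDouble() - 0.5; // small random start in [-0.5, 0.5)
            }
        }
        for (int j = 0; j < outputWeights.length; j++) {
            outputWeights[j] = rng.nextDouble() - 0.5;
        }
    }

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Forward activation of the hidden layer. */
    double[] hidden(double[] x) {
        double[] h = new double[hiddenWeights.length];
        for (int j = 0; j < h.length; j++) {
            double sum = hiddenWeights[j][x.length]; // bias term
            for (int i = 0; i < x.length; i++) {
                sum += hiddenWeights[j][i] * x[i];
            }
            h[j] = sigmoid(sum);
        }
        return h;
    }

    /** Forward activation of the single output neuron. */
    double output(double[] h) {
        double sum = outputWeights[h.length]; // bias term
        for (int j = 0; j < h.length; j++) {
            sum += outputWeights[j] * h[j];
        }
        return sigmoid(sum);
    }

    double predict(double[] x) {
        return output(hidden(x));
    }

    /** One BP iteration on one record; returns the squared error before the update. */
    double trainStep(double[] x, double target, double learningRate) {
        // Step 1: forward activation.
        double[] h = hidden(x);
        double y = output(h);
        double error = target - y;

        // Step 2: backward propagation of the error (sigmoid derivative is y * (1 - y)).
        double outputDelta = error * y * (1.0 - y);
        for (int j = 0; j < h.length; j++) {
            // Use the output weight from before this step's update.
            double hiddenDelta = outputDelta * outputWeights[j] * h[j] * (1.0 - h[j]);
            for (int i = 0; i < x.length; i++) {
                hiddenWeights[j][i] += learningRate * hiddenDelta * x[i];
            }
            hiddenWeights[j][x.length] += learningRate * hiddenDelta;
            outputWeights[j] += learningRate * outputDelta * h[j];
        }
        outputWeights[h.length] += learningRate * outputDelta;
        return error * error;
    }
}
```

Repeating `trainStep` over all training records until the error falls below a chosen maximum corresponds to the training loop; a learning rate that is too high can make the weight updates overshoot the minimum of the error surface.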
3.1.3 Usecase Diagram
Below is the use case diagram for this system, with the medical doctor as the only actor, since the doctor is the only individual that interacts with the system.
Fig 1: Use cases for the Medical Doctor
3.1.4 Data collection and analysis
For data collection, the hepatitis dataset from the UCI Machine Learning Repository is used (http://archive.ics.uci.edu/ml/datasets/Hepatitis). This dataset is made available mainly for AI-related research purposes. Using it minimizes the cost, stress and time needed to source live clinical records of patients from hospitals. The dataset contains a total of 155 samples, 75 of which have missing attribute values, and a total of 20 attributes.
For data analysis, the data was carefully examined and some changes were made to suit the aim of the research work. The missing attributes were filled with random values. Two (2) attributes were deleted from the dataset, leaving a total of 18 attributes: 17 input attributes and 1 output attribute. The output attribute was renamed to reflect its new role.
Finally, the dataset was categorized into two different sets:
i. Dataset A used for the model training. The dataset is used to build the model.
ii. Dataset B, used for model testing. A separate dataset is fed into the ANN diagnostic predictive module to test its feasibility and accuracy, which avoids obtaining biased results.
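As a rough illustration of such a split, the sketch below partitions record indices into a training set and a testing set. The paper does not state how the testing records were chosen, so the seeded shuffle, the class name `DatasetSplitSketch` and the method `split` are assumptions made for this example only.

```java
// Illustrative sketch; the shuffling strategy and all names are assumptions.
final class DatasetSplitSketch {

    /** Returns {trainingIndices, testingIndices} after a seeded Fisher-Yates shuffle. */
    static int[][] split(int totalRecords, int testSize, long seed) {
        if (testSize < 0 || testSize > totalRecords) {
            throw new IllegalArgumentException("testSize must be between 0 and totalRecords");
        }
        int[] indices = new int[totalRecords];
        for (int i = 0; i < totalRecords; i++) {
            indices[i] = i;
        }
        java.util.Random rng = new java.util.Random(seed);
        for (int i = totalRecords - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // pick a position in [0, i]
            int tmp = indices[i];
            indices[i] = indices[j];
            indices[j] = tmp;
        }
        int trainSize = totalRecords - testSize;
        int[] training = java.util.Arrays.copyOfRange(indices, 0, trainSize);
        int[] testing = java.util.Arrays.copyOfRange(indices, trainSize, totalRecords);
        return new int[][] { training, testing };
    }
}
```

With the figures reported in this work, `DatasetSplitSketch.split(155, 10, seed)` would yield 145 training indices and 10 testing indices with no record appearing in both sets.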
3.1.5 Model development
After data collection, two data preprocessing procedures are conducted so that the ANN can be trained more efficiently: handling missing data and normalization. The missing data are replaced by the average of neighboring values in the dataset. The data is normalized because neural networks generally perform better with normalized data, whereas using the original data as input may cause convergence problems. The hepatitis data were then transformed into values between 0 and 1 with guidance from a ranking concept. Linguistic variables such as Yes and No were used for non-measurable concepts, and Mild and Severe for measurable concepts, as shown in the table below.
Table 1: Ranking of Input/ Output variables Hepatitis Disease
Data normalization is performed mainly on the non-measurable concepts in the Excel spreadsheet package using the formula below:
The data was saved as a file to be loaded and used during the model training and testing phase.
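The two preprocessing steps described in this subsection can be sketched in Java as follows. This is only an illustration under assumptions: the spreadsheet formula used in this work is not reproduced here, so min-max scaling is assumed as the normalization, missing values are assumed to be marked as NaN, and "neighboring values" is interpreted as the nearest known values before and after in the same column. The class `PreprocessSketch` and its methods are hypothetical.

```java
// Illustrative sketch; min-max scaling and the neighbor rule are assumptions.
final class PreprocessSketch {

    /** Replaces NaN entries with the mean of the nearest known values before and after. */
    static double[] fillMissing(double[] column) {
        double[] filled = column.clone();
        for (int i = 0; i < column.length; i++) {
            if (!Double.isNaN(column[i])) {
                continue;
            }
            double sum = 0.0;
            int count = 0;
            for (int k = i - 1; k >= 0; k--) {          // nearest known value before
                if (!Double.isNaN(column[k])) { sum += column[k]; count++; break; }
            }
            for (int k = i + 1; k < column.length; k++) { // nearest known value after
                if (!Double.isNaN(column[k])) { sum += column[k]; count++; break; }
            }
            filled[i] = (count == 0) ? 0.0 : sum / count; // all-missing column falls back to 0
        }
        return filled;
    }

    /** Min-max scaling into [0, 1]; expects no NaN values; a constant column maps to 0. */
    static double[] minMax(double[] column) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : column) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] scaled = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            scaled[i] = (range == 0.0) ? 0.0 : (column[i] - min) / range;
        }
        return scaled;
    }
}
```

Each attribute column would be passed through `fillMissing` first and `minMax` second, so that the network only ever sees complete values between 0 and 1.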
The constructed sets for the input/output parameters (concepts) are given below, based on the data obtained with slight modifications to suit the aim and objectives of the research.
Table 2: Data sets for assessing the severity of Hepatitis B
3.2.1 Experimental results for the diagnosis of hepatitis B virus disease during training
The system performance was evaluated by simulating the 145 training cases for HBV disease numerous times. Each case has a total of 17 input variables and 1 output result, because the model is trained in a supervised manner. The experiment was repeated many times, adjusting the number of hidden neurons and the training parameters, in order to pick the best structure. The results show that the choice of the number of hidden neurons is crucial to the effectiveness and performance of a neural network, and that one hidden layer is enough for this particular research. The experiments also showed that the success of a neural network is very sensitive to the parameters chosen at the start of training: the learning rate must not be too high and the maximum error must not be too low. Furthermore, the total mean square error does not directly reflect the success of network training and can sometimes be misleading. To guard against this and to check the accuracy of the model, the individual error made for every input was also observed. The table below summarizes some of the experiments carried out during the ANN model training phase, which helped in picking the model to be used in the hepatitis diagnostic system.
Table 3: Summary Training Experimental Results
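The point about the total mean square error being potentially misleading can be illustrated with a short Java sketch that reports both the aggregate error and the per-record errors. The class `TrainingErrorSketch` and its methods are hypothetical and are not part of Neuroph; the plain mean of squared differences is assumed as the error measure.

```java
// Illustrative sketch; names and the exact error measure are assumptions.
final class TrainingErrorSketch {

    /** Mean of the squared differences over all records. */
    static double meanSquaredError(double[] expected, double[] predicted) {
        requireSameNonEmptyLength(expected, predicted);
        double sum = 0.0;
        for (int i = 0; i < expected.length; i++) {
            double diff = expected[i] - predicted[i];
            sum += diff * diff;
        }
        return sum / expected.length;
    }

    /** Absolute error of each record, in input order. */
    static double[] individualErrors(double[] expected, double[] predicted) {
        requireSameNonEmptyLength(expected, predicted);
        double[] errors = new double[expected.length];
        for (int i = 0; i < expected.length; i++) {
            errors[i] = Math.abs(expected[i] - predicted[i]);
        }
        return errors;
    }

    /** Index of the record with the largest absolute error. */
    static int worstRecord(double[] expected, double[] predicted) {
        double[] errors = individualErrors(expected, predicted);
        int worst = 0;
        for (int i = 1; i < errors.length; i++) {
            if (errors[i] > errors[worst]) {
                worst = i;
            }
        }
        return worst;
    }

    private static void requireSameNonEmptyLength(double[] a, double[] b) {
        if (a.length == 0 || a.length != b.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
    }
}
```

A network can show a modest aggregate error while one record is classified completely wrongly; inspecting `individualErrors` or `worstRecord` alongside `meanSquaredError` exposes such cases.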
3.2.2 Experimental results for diagnosis of hepatitis B virus disease during testing phase
For documentation purposes, only 2 of the test cases are discussed here. In the testing phase, 17 input values are fed to the model with no output. The model then runs the simulation based on the knowledge it acquired during the training phase and generates an output that classifies the infection as either mild or severe, based on the concepts defined earlier.
First test scenario: For this scenario, the model was fed the following input variables:
The calculated value of the output Y was exactly 1, matching the expected result and indicating that the patient whose symptoms were entered as the variables above was diagnosed with severe HBV disease.
Second test scenario: For this scenario, the model was fed the following input variables:
The calculated value of the output Y was 0.214, which matches the expected result when approximated, indicating that the patient whose symptoms were entered as the variables above was diagnosed with mild HBV disease based on the concepts defined earlier.
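The mapping from the network output Y to a severity label in the two scenarios above can be sketched as below. The cut-off of 0.5 is an assumption made for illustration, chosen so that an output of 1 reads as severe and an output of 0.214 reads as mild, consistent with the two scenarios; the class `SeverityLabelSketch` and its method are hypothetical.

```java
// Illustrative sketch; the 0.5 cut-off and all names are assumptions.
final class SeverityLabelSketch {
    static final double CUT_OFF = 0.5; // assumed boundary between mild and severe

    /** Maps a sigmoid output in [0, 1] to a severity label. */
    static String label(double y) {
        if (Double.isNaN(y) || y < 0.0 || y > 1.0) {
            throw new IllegalArgumentException("output must lie in [0, 1]");
        }
        return (y >= CUT_OFF) ? "Severe" : "Mild";
    }
}
```

Under this assumption, `SeverityLabelSketch.label(1.0)` gives "Severe" and `SeverityLabelSketch.label(0.214)` gives "Mild".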
3.2.3 System testing
Table 4: Test Cases
In conclusion, the importance of ANN in the medical field cannot be overemphasized. Without sound diagnosis and accurate treatment, medical practice is as good as guesswork. As shown in this work, computer-aided diagnosis has the potential to transform traditional clinical practice into a more evidence-driven method.
This research work will encourage other researchers to apply ANN to problems in the medical field and in other areas such as weather forecasting, crime prediction and stock market prediction, thereby showcasing the flexibility and scalability of the ANN method. This will equally open the door for the next generation of intelligent systems.
The developed model can be implemented in clinical practice to provide a faster, more accurate and more reliable diagnostic process. This research has shown that the medical sector has a lot more to gain from Artificial Intelligence.
Wilkins, T., Zimmerman, D., & Schade, R. R. (2010). Hepatitis B: diagnosis and treatment. Health care, 100, 6.
Hornby, A. S., & Wehmeier, S. (1995). Oxford advanced learner’s dictionary (Vol. 1428). Oxford: Oxford university press.
Mauss, S., Berg, T., Rockstroh, J., Sarrazin, C., Wedemeyer, H., & Kamps, B. S. (2014). Hepatology-A clinical textbook.
Sumana, G., Babu, G. A., & Kumar, R. S. (2013). Diagnosis of Glomerulonephritis by an ANN Based on Physical Symptoms and Clinical Observations of the Blood Samples. In Proceedings of the World Congress on Engineering (Vol. 2).
Negnevitsky, M. (2005). Artificial intelligence: a guide to intelligent systems. Pearson Education.
Russell, S., & Norvig, P. (1995). Artificial Intelligence: A modern approach. Englewood Cliffs: Prentice-Hall.
Djam, X. Y., & Kimbi, Y. H. (2011). Fuzzy expert system for the management of hypertension. The Pacific Journal of Science and Technology, 12(1), 390-402.
Kolandaisamy, R., & Noor, R. (2016). Web Based Online Medical Diagnosis System (WOMEDS). International Journal of Computer Science Innovations and Technologies, 1(1), 47-52.
Rezaee, K., Haddadnia, J., & Rasegh Ghezelbash, M. (2014). A Novel Algorithm for Accurate Diagnosis of Hepatitis B and Its Severity. International Journal of Hospital Research, 3(1), 1-10.
Raoufy, M. R., Vahdani, P., Alavian, S. M., Fekri, S., Eftekhari, P., & Gharibzadeh, S. (2011). A novel method for diagnosing cirrhosis in patients with chronic hepatitis B: artificial neural network approach. Journal of medical systems, 35(1), 121-126.
Bascil, M. S., & Oztekin, H. (2012). A study on hepatitis disease diagnosis using probabilistic neural network. Journal of medical systems, 36(3), 1603-1606.
Venkatesam, M., Penchalaiah, P., Kurnool, A. P., & Nellore, A. P. International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS).
Kh, R., Rasegh Ghezelbash, M., Chagha Ghasemi, N., & Haddania, J. (2012). An intelligent diagnostic system for detection of hepatitis using multi-layer perceptron and colonial competitive algorithm. J Math Com Sci, 4, 237-245.
Panchal, D., & Shah, S. (2011). Artificial intelligence based expert system for hepatitis B diagnosis. International Journal of Modeling and Optimization, 1(4), 362.
Mahesh, C., Kannan, E., & Saravanan, M. S. (2014). Generalized regression neural network based expert system for hepatitis b diagnosis. Journal of Computer Science, 10(4), 563.
Khorashadizade, N., & Rezaei, H. (2015). New method for rapid diagnosis of Hepatitis disease based on reduction feature and machine learning. Journal of Advanced Computer Science & Technology, 4(1), 148-155.
Ansari, S., Shafi, I., Ansari, A., Ahmad, J., & Shah, S. I. (2011, December). Diagnosis of liver disease induced by hepatitis virus using artificial neural networks. In Multitopic Conference (INMIC), 2011 IEEE 14th International (pp. 8-12). IEEE.
Sartakhti, J. S., Zangooei, M. H., & Mozafari, K. (2012). Hepatitis disease diagnosis using a novel hybrid method based on support vector machine and simulated annealing (SVM-SA). Computer methods and programs in biomedicine, 108(2), 570-579.
Karlik, B. (2011). Hepatitis disease diagnosis using backpropagation and the naive bayes classifiers. BURCH Journal of Science and Technology, 1(1), 49-62.
Neshat, M., Sargolzaei, M., Nadjaran Toosi, A., & Masoumi, A. (2012). Hepatitis disease diagnosis using hybrid case based reasoning and particle swarm optimization. ISRN Artificial Intelligence, 2012.
Wang, D., Wang, Q., Shan, F., Liu, B., & Lu, C. (2010). Identification of the risk for liver fibrosis on CHB patients using an artificial neural network based on routine and serum markers. BMC infectious diseases, 10(1), 251.
Jilani, T. A., Yasin, H., & Yasin, M. M. (2011). PCA-ANN for classification of Hepatitis-C patients. Int J Comput Appli (0975–8887), 14(7).
Djam, X. Y., & Kimbi, Y. H. (2011). A decision support system for tuberculosis diagnosis. The Pacific Journal of Science and Technology, 12(2), 410-425.
Aspinall, E. J., Hawkins, G., Fraser, A., Hutchinson, S. J., & Goldberg, D. (2011). Hepatitis B prevention, diagnosis, treatment and care: a review. Occupational medicine, 61(8), 531-540.
World Health Organization. (2002). Hepatitis B (WHO/CDS/CSR/DRS/2002.2). Department of Communicable Disease Surveillance and Response.
Djam, X. Y. (2013). A framework for a dynamic medical expert system for the management of tropical diseases.