Abstract
The Machine Learning (ML) field has gained momentum in almost every domain of
research and has recently become a reliable tool in the medical domain. The
empirical domain of automatic learning is used in tasks such as medical
decision support, medical imaging, protein-protein interaction, extraction of
medical knowledge, and overall patient management care. ML is envisioned as
a tool by which computer-based systems can be integrated into the healthcare
field in order to provide better, more efficient medical care. This paper
describes an ML-based methodology for building an application that is capable
of identifying and disseminating healthcare information. It extracts sentences
from published medical papers that mention diseases and treatments, and
identifies semantic relations that exist between diseases and treatments. Our
evaluation results for these tasks show that the proposed methodology obtains
reliable outcomes that could be integrated into an application to be used in
the medical care domain. The potential value of this paper lies in the ML
settings that we propose and in the fact that we outperform previous results on
the same data set.
GOAL OF PROJECT
The work that we present in this paper is focused on two tasks: automatically
identifying whether sentences published in medical abstracts (Medline) contain
information about diseases and treatments, and automatically identifying
semantic relations that exist between diseases and treatments, as expressed in
these texts. The second task is focused on three semantic relations: Cure,
Prevent, and Side Effect.
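The two tasks can be read as a two-stage pipeline: first decide whether a
sentence is informative, then assign one of the three relations to the
sentences that pass. The following is a minimal sketch of that flow, not the
method evaluated in the paper. It assumes a bag-of-words representation and a
small hand-written Naive Bayes classifier; the class NaiveBayes, the helpers
tokenize and process, and all training sentences are hypothetical and made up
for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase and split on whitespace; a stand-in for real preprocessing."""
    return sentence.lower().replace(".", " ").replace(",", " ").split()


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts, add-one smoothing."""

    def fit(self, sentences, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter(labels)
        self.vocab = set()
        for sentence, label in zip(sentences, labels):
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, sentence):
        tokens = tokenize(sentence)
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.label_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, invented training data (NOT taken from the paper's data set).
RELATION_DATA = [
    ("aspirin cures headache in most patients", "Cure"),
    ("antibiotics cure bacterial infection", "Cure"),
    ("the vaccine prevents influenza infection", "Prevent"),
    ("regular exercise prevents heart disease", "Prevent"),
    ("the drug caused nausea as a side effect", "Side Effect"),
    ("treatment caused rash as a side effect", "Side Effect"),
]
SENTENCE_DATA = [
    ("aspirin cures headache in most patients", "informative"),
    ("the vaccine prevents influenza infection", "informative"),
    ("the drug caused nausea as a side effect", "informative"),
    ("the study was conducted in 2005", "non-informative"),
    ("patients were recruited from two hospitals", "non-informative"),
    ("results are discussed in the next section", "non-informative"),
]

sentence_clf = NaiveBayes().fit(*zip(*SENTENCE_DATA))  # task 1
relation_clf = NaiveBayes().fit(*zip(*RELATION_DATA))  # task 2


def process(sentence, sentence_clf, relation_clf):
    """Return a relation label, or None if the sentence is filtered out."""
    if sentence_clf.predict(sentence) != "informative":
        return None
    return relation_clf.predict(sentence)


if __name__ == "__main__":
    for s in ["the vaccine prevents measles", "patients were recruited in 2005"]:
        print(s, "->", process(s, sentence_clf, relation_clf))
```

Keeping the two classifiers separate means the first stage can discard
uninformative sentences before the relation classifier ever sees them; a
single four-class classifier would be an equally plausible design.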
ANALYSIS ON EXISTING SYSTEM
In order to embrace the views that the Electronic Health Record (EHR) system
has, we need better, faster, and more reliable access to information. In the
medical domain, the richest and most used source of information is Medline, a
database of extensive life science published articles. Research discoveries
enter the repository at a high rate (Hunter and Cohen [12]), making the process
of identifying and disseminating reliable information a very difficult task.
The tasks that are addressed here are the foundation of an information
technology framework that identifies and disseminates healthcare information.
People want fast access to reliable information, in a manner that is suitable
to their habits and workflow. Medical care related information (e.g., published
articles, clinical trials, news, etc.) is a source of power for both healthcare
providers and laypeople. Studies reveal that people search the web and read
medical information in order to be informed about their health. Ginsberg et al.
[10] show how a new outbreak of the influenza virus can be detected from search
engine query data.
PROBLEM DEFINITION
The problems addressed in this paper form the building blocks of a framework
that can be used by healthcare providers (e.g., private clinics, hospitals,
medical doctors, etc.), companies that build systematic reviews (hereafter,
SR), or laypeople who want to be in charge of their health by reading the
latest life science published articles related to their interests. The final
product can be envisioned as a browser plug-in or a desktop application that
will automatically find and extract the latest medical discoveries related to
disease-treatment relations and present them to the user. The product can be
developed and sold by companies that do research in Healthcare Informatics,
Natural Language Processing, and Machine Learning, and companies that develop
tools like Microsoft HealthVault. The value of the product from an e-commerce
point of view lies in the fact that it can be used in marketing strategies to
show that the information presented is trustworthy (Medline articles) and that
the results are the latest discoveries. For any type of business, the trust and
interest of customers are the key success factors.
Disadvantage
IDEA ON PROPOSED SYSTEM
Our objective for this work is to show which Natural Language Processing (NLP)
and Machine Learning (ML) techniques (which representation of information and
which classification algorithms) are suitable for identifying and classifying
relevant medical information in short texts. We acknowledge the fact that tools
capable of identifying reliable information in the medical domain stand as
building blocks for a healthcare system that is up-to-date with the latest
discoveries. In this research, we focus on disease and treatment information,
and the relation that exists between these two entities. Our interests are in
line with the trend toward personalized medicine, in which each patient's
medical care is tailored to his or her needs. It is not enough to read and know
about only one study stating that a treatment is beneficial for a certain
disease. Healthcare providers need to be up-to-date with all new discoveries
about a certain treatment, in order to identify whether it might have side
effects for certain types of patients. We envision the potential and value of
the findings of our work as guidelines for the performance of a framework that
is capable of finding relevant information about diseases and treatments in a
medical domain repository. The results that we obtained show that it is a
realistic scenario to use NLP and ML techniques to build a tool, similar to an
RSS feed, capable of identifying and disseminating textual information related
to diseases and treatments. Therefore, this study is aimed at designing and
examining various representation techniques in combination with various
learning methods to identify and extract biomedical relations from literature.
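
Examining representations in combination with learning methods amounts to
scoring every (representation, learner) pair on the same labeled sentences.
The sketch below illustrates that idea only; it does not reproduce the
representations or algorithms used in the study. The two representations
(unigrams, and unigrams plus bigrams), the two learners (a 1-nearest-neighbour
rule and a majority-class baseline), the leave-one-out protocol, and the toy
sentences are all assumptions chosen to keep the example small.

```python
from collections import Counter


def unigrams(sentence):
    """Bag of lowercase word tokens."""
    return Counter(sentence.lower().split())


def unigrams_and_bigrams(sentence):
    """Bag of word tokens plus adjacent word pairs."""
    tokens = sentence.lower().split()
    feats = Counter(tokens)
    feats.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return feats


def nearest_neighbour(train, features):
    """Label of the training example sharing the most features (1-NN)."""
    def overlap(example):
        return sum((example[0] & features).values())
    return max(train, key=overlap)[1]


def majority_class(train, features):
    """Baseline: ignore the features, predict the most frequent label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def leave_one_out_accuracy(data, represent, learner):
    """Hold out each sentence in turn and predict it from the rest."""
    examples = [(represent(s), label) for s, label in data]
    correct = 0
    for i, (feats, label) in enumerate(examples):
        rest = examples[:i] + examples[i + 1:]
        correct += learner(rest, feats) == label
    return correct / len(examples)


# Toy, invented sentences (NOT taken from the paper's data set).
DATA = [
    ("aspirin cures headache in most patients", "Cure"),
    ("antibiotics cure bacterial infection", "Cure"),
    ("the vaccine prevents influenza infection", "Prevent"),
    ("regular exercise prevents heart disease", "Prevent"),
    ("the drug caused nausea as a side effect", "Side Effect"),
    ("treatment caused rash as a side effect", "Side Effect"),
]

REPRESENTATIONS = {"unigrams": unigrams, "uni+bigrams": unigrams_and_bigrams}
LEARNERS = {"1-NN": nearest_neighbour, "majority": majority_class}

# One score per (representation, learner) combination.
results = {
    (rep_name, learner_name): leave_one_out_accuracy(DATA, rep, learner)
    for rep_name, rep in REPRESENTATIONS.items()
    for learner_name, learner in LEARNERS.items()
}

if __name__ == "__main__":
    for (rep_name, learner_name), acc in sorted(results.items()):
        print(f"{rep_name:12s} {learner_name:9s} accuracy={acc:.2f}")
```

With only six sentences the scores carry no meaning; the point is the shape of
the experiment, in which representations and learners vary independently and
are compared under one evaluation protocol.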