Abstract: Temporal information is useful in many NLP applications, such as information extraction, question answering, and summarization. In this paper, we present a temporal parser for extracting and normalizing temporal expressions from Chinese texts. An integrated temporal framework is proposed, which includes basic temporal concepts and the classification of temporal expressions. Temporal expressions are identified by chart parsing based on grammar rules and constraint rules. We…
We introduce ITUTime, a rule-based tagger for detecting and normalizing temporal expressions in Turkish. The proposed system is morphologically aware and requires no preprocessing steps, since it operates on free text. We also establish the first temporally annotated dataset for Turkish. The work presented here serves as a baseline for tagging Turkish temporal expressions. Evaluated on a manually annotated test dataset, the system achieved an F1 score of 0.89 on recognition and 0.89 on normalization.
“…As defined in [4], Chinese temporal expressions can be classified as precise, fuzzy, modified, set-denoting, and non-specific temporal expressions, among others. In [10], a temporal parser for extracting and normalizing temporal expressions from Chinese texts is presented; the authors also propose a temporal framework that includes basic temporal objects and relations as well as the measurement and classification of temporal expressions. To infer temporal information from multiple-clause sentences, a computational model based on machine learning and heterogeneous collaborative bootstrapping is built in [11], and the effects of linguistic features such as tense/aspect, temporal connectives, and discourse structures are also considered.…”
Temporal information is an important characteristic of an event. It can be used in the information retrieval process to organize the returned results. In Chinese, time expressions take very complex forms, which makes it difficult both to recognize a time expression accurately and to connect it precisely with a given event in a web page that contains multiple events. To address these problems, this paper presents an event time extraction model. Rather than relying only on the local context within a web page or a text, the model applies the global context provided by all the web pages that have been automatically judged as related to a given event. Our experimental results, based on the evaluation criteria, show the feasibility of the proposed model.
“…According to the TimeML standard, a label needs to contain four attributes: (1) tid, the index of temporal information in the record; (2) Type, the temporal type; (3) Value, the normalized date of a TE; and (4) anchorTimeID, the index of the reference time of the current TE. Each TE has a unique tag (eg, <TIMEX3 tid="t1" Type="TIME" Value="2014-10-11T07:48:16">2014/10/11 7:48:16</TIMEX3>, <TIMEX3 tid="t7" Type="DATE" Value="2014-08-02" anchorTimeID="t2">1 周前</TIMEX3>, <TIMEX3 tid="t4" Type="DURATION" Value="P1Y"> 1年 余</TIMEX3>, <TIMEX3 tid="t13" Type="SET" Value=" [10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29]" anchorTimeID="t11"> 第7、9、14 天 </TIMEX3>). These 900 EMRs contain 12,096 TEs (13.44…”
Section: Datasets
mentioning
confidence: 99%
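The four label attributes listed in the statement above can be read back out of annotated text with a small parser. What follows is a minimal Python sketch, not the cited authors' tooling: the `Timex` class, the `parse_timex3` function, and both regular expressions are hypothetical names introduced here, and the attribute spellings (`tid`, `Type`, `Value`, `anchorTimeID`) are taken from the examples quoted in the statement.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumption: tags are flat (not nested) and attributes use double quotes,
# as in the examples quoted in the citation statement.
TIMEX3_RE = re.compile(r'<TIMEX3\s+(?P<attrs>[^>]*)>(?P<text>.*?)</TIMEX3>', re.DOTALL)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


@dataclass
class Timex:
    tid: str                       # index of the temporal expression in the record
    type: str                      # temporal type, e.g. TIME, DATE, DURATION, SET
    value: str                     # normalized value
    anchor_time_id: Optional[str]  # tid of the reference time, if any
    text: str                      # surface text of the expression


def parse_timex3(annotated: str) -> List[Timex]:
    """Collect every TIMEX3 tag in the annotated string."""
    result = []
    for match in TIMEX3_RE.finditer(annotated):
        attrs = dict(ATTR_RE.findall(match.group("attrs")))
        result.append(
            Timex(
                tid=attrs["tid"],
                type=attrs["Type"],
                value=attrs["Value"],
                anchor_time_id=attrs.get("anchorTimeID"),
                text=match.group("text").strip(),
            )
        )
    return result


sample = (
    '<TIMEX3 tid="t1" Type="TIME" Value="2014-10-11T07:48:16">2014/10/11 7:48:16</TIMEX3> '
    '<TIMEX3 tid="t7" Type="DATE" Value="2014-08-02" anchorTimeID="t2">1 周前</TIMEX3>'
)
for timex in parse_timex3(sample):
    print(timex)
```

Keeping `anchor_time_id` optional reflects the quoted examples, where only relative expressions such as "1 周前" carry a reference time.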
“…Meanwhile, research related to Chinese TE extraction and normalization was reported. Wu et al [26] proposed a temporal parser to extract and standardize Chinese TEs. Zhou et al [27] established a framework concentrating on processing narrative clinical records in Chinese, including a regular expression matching-based method for TE identification and an approach for temporal relationship extraction using CRF.…”
BACKGROUND
Temporal information appears frequently in narrative clinical text, in descriptions of disease progression, prescriptions, medication, surgical procedures, and discharge summaries. Accurate extraction and normalization of temporal expressions can improve the analysis and understanding of narrative clinical texts and thereby support clinical research and practice.
OBJECTIVE
This study proposes a novel approach for extracting and normalizing temporal expressions from Chinese narrative clinical text.
METHODS
TNorm, a rule-based and pattern learning-based approach, has been developed for automatic temporal expression extraction and normalization from unstructured Chinese clinical text data. TNorm consists of three stages: extraction, classification, and normalization. First, it applies a set of heuristic rules and automatically generated patterns to identify and extract temporal expressions from clinical texts. Then, it collects features of the extracted temporal expressions and uses machine learning algorithms to predict and classify their temporal types. Finally, the features are combined with the rule-based and pattern learning-based approaches to normalize the extracted temporal expressions.
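The three stages described above can be illustrated with a toy pipeline. This is a sketch under stated assumptions, not TNorm itself: the three patterns, the function names (`extract`, `classify`, `normalize`, `run_pipeline`), and the sample sentence are all invented for illustration, and the machine-learning type classifier is replaced here by a fixed lookup table.

```python
import re
from datetime import date, timedelta
from typing import List, Tuple

# Hypothetical hand-written patterns; the real system uses heuristic rules
# plus automatically generated patterns that are not reproduced here.
PATTERN = re.compile(
    r'(?P<abs>\d{4}年\d{1,2}月\d{1,2}日)'  # absolute date, e.g. 2014年10月11日
    r'|(?P<rel>\d+\s*(?:天|周)前)'          # relative date, e.g. 1周前 (1 week ago)
    r'|(?P<dur>\d+\s*(?:年|天)余?)'         # duration, e.g. 1年余 (over 1 year)
)

# Stand-in for the machine-learning type classifier: a fixed lookup table.
KIND_TO_TYPE = {"abs": "DATE", "rel": "DATE", "dur": "DURATION"}


def extract(text: str) -> List[Tuple[str, str]]:
    """Stage 1: return (surface string, matched pattern name) pairs."""
    return [(m.group(0), m.lastgroup) for m in PATTERN.finditer(text)]


def classify(kind: str) -> str:
    """Stage 2: map the matched pattern name to a temporal type."""
    return KIND_TO_TYPE[kind]


def normalize(surface: str, kind: str, anchor: date) -> str:
    """Stage 3: turn the surface string into a normalized value."""
    if kind == "abs":
        year, month, day = (int(n) for n in re.findall(r'\d+', surface))
        return date(year, month, day).isoformat()
    amount, unit = re.match(r'(\d+)\s*(年|天|周)', surface).groups()
    if kind == "rel":
        days = int(amount) * (7 if unit == "周" else 1)
        return (anchor - timedelta(days=days)).isoformat()
    # Duration: the approximate marker 余 is dropped in this toy version.
    return f"P{amount}{'Y' if unit == '年' else 'D'}"


def run_pipeline(text: str, anchor: date) -> List[Tuple[str, str, str]]:
    return [(s, classify(k), normalize(s, k, anchor)) for s, k in extract(text)]


note = "患者于2014年10月11日入院，1周前出现发热，高血压病史1年余。"
for row in run_pipeline(note, anchor=date(2014, 10, 11)):
    print(row)
```

Relative expressions need an anchor date to resolve; the sketch simply passes one in, whereas a real system would have to select the reference time from the document.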
RESULTS
The evaluation dataset is a set of Chinese narrative clinical texts comprising 1,459 discharge summaries from a domestic Grade-A Class-three hospital. The results show that TNorm, combining temporal expression extraction with temporal type prediction, achieves a precision of 0.8491, a recall of 0.8328, and an F1 score of 0.8409 in temporal expression normalization.
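As a consistency check on the reported figures, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short Python sketch below recomputes it from the precision and recall stated above; the function name `f1_score` is introduced here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the RESULTS paragraph.
print(round(f1_score(0.8491, 0.8328), 4))  # 0.8409
```

The recomputed value agrees with the reported F1 score to four decimal places.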
CONCLUSIONS
This study presents TNorm, an automatic approach that extracts and normalizes temporal expressions from Chinese narrative clinical texts. TNorm was evaluated on discharge summaries, and the experimental results demonstrate its effectiveness in temporal expression normalization.