Tetsuya Nakatoh scite author profile

Search results generated by searchable databases are served dynamically and are far larger in volume than the static documents on the Web. These results pages have been referred to as the Deep Web [1]. To integrate results across different searchable databases, we need to extract the target data from their results pages. We propose a testbed for information extraction from search results. We chose 100 databases at random from 114,540 pages with search forms, so the databases have good variety. We selected 51 databases whose results pages include URLs and manually identified the target information to be extracted. We also suggest evaluation measures for comparing extraction methods and methods for extending the target data.

Search and Analysis of Bankruptcy Cause by Classification Network

Hirokawa

Baba

Nakatoh

2011

Plagiarism detection using document similarity based on distributed representation

Baba

Nakatoh

Minami

2017

Procedia Computer Science

Automated Generation of Coding Rules: Text-Mining Approach to ISO 26000

Nakatoh

Uchida

Ishita

et al. 2016

Extraction of Tourist Behavior Contexts from Blog by Verbs and Their Objects

Nakatoh

Hirokawa

2012

Blog articles by tourists contain interesting, personal accounts of where and how they went, what they did, and what they thought. Such individual experiences are often more helpful than the general, official information about a tourist resort provided by tourist agents. However, it is not easy to select related articles and to extract the specific information required from these unsorted blog articles. This paper proposes a feature extraction technique based on dependency analysis of verbs and objects in sentences that describe tourists' behavior. The method was applied to 7,917,385 blog articles on the Kyushu area, and the paper reports some analyses of "where and what did they eat" as case studies.

Research Trends with Cross Tabulation Search Engine

Yin

Hirokawa

Yau

et al. 2013

To help researchers build a knowledge foundation for their research fields, which can be a time-consuming process, the authors have developed a Cross Tabulation Search Engine (CTSE). Its purpose is to assist researchers in 1) conducting research surveys, 2) efficiently and effectively retrieving information (such as important researchers, research groups, and keywords), and 3) obtaining analytical information on past and current research trends in a particular field. The CTSE system employs data-processing technologies and emphasizes a “Learn by Searching” learning strategy to support students in analyzing such research trends. To show the effectiveness of CTSE, a pilot experiment was conducted in which participants were assigned research survey tasks and then answered a questionnaire on the effectiveness and usability of the system. The results showed that the system helped students conduct research surveys, and that the research trend transitions the system presented were effective for producing research trend surveys. Moreover, most students had favorable attitudes toward the usage and usability of the system and were satisfied with gaining more knowledge of a particular research field in a short period.

Focused Citation Count: A Combined Measure of Relevancy and Quality

Nakatoh

Nakanishi

Baba

et al. 2015

Text mining of tourism preference in a multilingual site

Zeng

Nakatoh

Hirokawa

et al. 2018

IEEJ Transactions Elec Engng

There is a huge demand for multilingual tourism information about Japan because of the increasing number of tourists from foreign countries. Most of them may expect the typical, stereotyped culture, nature, and modern society of Japan. However, people from different backgrounds, cultures, and languages might also expect different aspects of Japan. In this paper, we analyze these differences as cultural tourism preferences for Japan. We propose a machine-learning-based method to identify the cultural tourism preferences of people from different countries by comparing the access logs, in different languages, of a multilingual tourism information site. We focus our discussion on the pages accessed in Thai and Vietnamese. Our results show that for Thai tourists the characteristic features are the famous places in an area and local specialties, whereas Vietnamese tourists pay much more attention to the facilities and location of hotels. This difference was not observable by naive extraction of keywords and their visualization. The result has been used as a guide for creating further content on the tourism information site.

Testbed for information extraction from deep web
