A considerable amount of time and effort is spent compiling drilling log databases for use by geological consulting companies, public research institutes and universities. However, some of these databases are often found to contain duplicate information (Shibuya et al., 2003). The primary cause is that the original paper reports, from which the computerized databases are compiled, are stored in more than one place. Database editors collect these paper reports from various organizations and consultants, so the collected material includes multiple copies of the same report. In particular, the original paper logs often do not record the geographical position (longitude and latitude) of the drilling points accurately. Moreover, the data are entered by several separate teams, and the report titles they enter do not follow exactly the same format. In particular, Japanese word processors use two kinds of characters, 1-byte and 2-byte. This background gives rise to duplicated data in drilling log databases.

In some cases more than 10% of the drilling logs contained in a database (City Eng. and Const. Dep. of Muroran City Office, 1998) were found to be duplicates. This problem is difficult to avoid even when a great deal of work is spent on detecting and deleting duplicate logs, because it is difficult to compare vast quantities of information from different locations by visual inspection alone.

Cluster analysis has previously been used to try to alleviate this problem. However, this technique is itself problematic, as it requires large amounts of computer memory and high-speed processors. Comparison of about 3,000 drilling logs is probably the limit on a personal computer.
Furthermore, complex visual inspection is still needed after cluster analysis in order to eliminate the duplicates completely.

In this paper we discuss an alternative solution to the problem using Fourier-type spectra of rectangular waveform functions.

METHOD

Numerical treatments

We assign numerical values to the soil names in the drilling logs, and then sample the numerical value at equal intervals.

Geoinformatics, vol.18, no.2, pp.55-60, 2007
Hideyasu ASAHI*, Kunio KAWAUCHI**, Toshikazu KUROSHIMA*, Mitihiro OOTA*** and Shinji HIRAI**

Abstract: Duplicate information is often found in soil drilling log databases. In this paper, we propose a method to detect this duplicated information in order to compile a reliable database. Duplicates are detected by the following method:
1) Each soil name in the soil log is assigned a specific number.
2) The numerically coded log is then transformed to the spectra of the Walsh-Hadamard function.
3) Subsequently, the direct (DC) component and the first few low-order spectra (approximately 4 or 5) are sorted using a computer spreadsheet program such as Lotus 123 or MS Excel.
4) In the sorted spreadsheet, duplicated drilling logs appear adjacent to one another and can be readily identified.
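Steps 1) to 4) above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the soil codes in `SOIL_CODES`, the layer format, the 16-sample window over 32 m, the sequency ordering of the coefficients and the matching tolerance are all assumptions made for illustration, and a plain sort stands in for the spreadsheet.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical soil-name codes; the paper does not list its actual values.
SOIL_CODES: Dict[str, int] = {
    "fill": 1, "clay": 2, "silt": 3, "sand": 4, "gravel": 5, "rock": 6,
}

Layer = Tuple[float, str]  # (bottom depth in metres, soil name), top to bottom


def sample_log(layers: Sequence[Layer], n_samples: int = 16,
               max_depth: float = 32.0) -> List[int]:
    """Step 1: code each soil name and sample it at equal depth intervals."""
    step = max_depth / n_samples
    samples = []
    for i in range(n_samples):
        depth = (i + 0.5) * step  # midpoint of the i-th interval
        code = 0                  # 0 is used below the deepest recorded layer
        for bottom, soil in layers:
            if depth < bottom:
                code = SOIL_CODES[soil]
                break
        samples.append(code)
    return samples


def walsh_hadamard(values: Sequence[float]) -> List[float]:
    """Step 2: fast Walsh-Hadamard transform (natural order), scaled by 1/n."""
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                a, b = data[j], data[j + h]
                data[j], data[j + h] = a + b, a - b
        h *= 2
    return [x / n for x in data]


def low_order_spectra(samples: Sequence[float], n_terms: int = 5) -> Tuple[float, ...]:
    """Step 3 key: direct component plus the next low-sequency coefficients."""
    spec = walsh_hadamard(samples)
    bits = len(spec).bit_length() - 1
    key = []
    for k in range(min(n_terms, len(spec))):
        gray = k ^ (k >> 1)  # sequency index -> Gray code
        # Bit-reversing the Gray code gives the natural-order (Hadamard) index.
        idx = int(format(gray, f"0{bits}b")[::-1], 2) if bits else 0
        key.append(spec[idx])
    return tuple(key)


def find_duplicate_candidates(logs: Dict[str, Sequence[Layer]],
                              n_terms: int = 5,
                              tol: float = 1e-9) -> List[Tuple[str, str]]:
    """Steps 3 and 4: sort by the spectral key and compare adjacent rows."""
    keyed = sorted((low_order_spectra(sample_log(layers), n_terms), name)
                   for name, layers in logs.items())
    pairs = []
    for (key_a, name_a), (key_b, name_b) in zip(keyed, keyed[1:]):
        if all(abs(x - y) <= tol for x, y in zip(key_a, key_b)):
            pairs.append((name_a, name_b))
    return pairs


# Invented example logs; "A-1 (copy)" repeats "A-1" under a different title.
example_logs = {
    "A-1": [(2.0, "fill"), (8.0, "clay"), (20.0, "sand"), (32.0, "rock")],
    "B-7": [(4.0, "clay"), (16.0, "silt"), (32.0, "gravel")],
    "A-1 (copy)": [(2.0, "fill"), (8.0, "clay"), (20.0, "sand"), (32.0, "rock")],
}
print(find_duplicate_candidates(example_logs))  # [('A-1', 'A-1 (copy)')]
```

Because only a handful of coefficients per log are sorted and compared, the cost grows roughly as a sort over the number of logs rather than as an all-pairs comparison. Logs that match on these few coefficients are only candidates, so a final check of the full logs would still be needed.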