The script of the ancient Indus civilization remains undeciphered. The hypothesis that the script encodes language has recently been questioned. Here, we present evidence for the linguistic hypothesis by showing that the script's conditional entropy is closer to that of natural languages than to that of various types of nonlinguistic systems.
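To make the conditional-entropy comparison concrete, the following is a minimal sketch of a plug-in (maximum-likelihood) estimate of the entropy of the next sign given the current sign. The function name `conditional_entropy` and the toy sequences are illustrative assumptions. This is not the estimator used by the authors, and a naive estimate like this one is biased on small corpora.

```python
import math
from collections import Counter


def conditional_entropy(sequences):
    """Estimate H(next sign | current sign) in bits from adjacent-sign counts.

    Naive plug-in estimate: no smoothing and no small-sample correction.
    """
    pair_counts = Counter()
    first_counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1

    total = sum(pair_counts.values())
    h = 0.0
    for (a, b), n in pair_counts.items():
        p_ab = n / total                 # joint probability of the pair
        p_b_given_a = n / first_counts[a]  # conditional probability of b after a
        h -= p_ab * math.log2(p_b_given_a)
    return h


# Toy data (hypothetical): a rigid system where each sign has one successor,
# and a maximally flexible two-sign system where any sign can follow any other.
rigid = [["A", "B", "C", "D"]] * 5
flexible = [["A", "A"], ["A", "B"], ["B", "A"], ["B", "B"]]

h_rigid = conditional_entropy(rigid)
h_flexible = conditional_entropy(flexible)
```

A fully rigid ordering gives a conditional entropy of zero, while a uniformly random ordering over two signs gives one bit. Natural-language sequences would be expected to fall between such extremes.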
Three different types of very intense, quasi-regular X-ray bursts have been observed from the Galactic superluminal X-ray transient source GRS 1915+105 with the Pointed Proportional Counters of the Indian X-ray Astronomy Experiment onboard the Indian satellite IRS-P3. The observations were carried out from 1997 June 12 to June 29 in the energy range of 2−18 keV and revealed the presence of persistent quasi-regular bursts with different structures. Only one of the three types of bursts is regular in occurrence revealing a stable profile over extended durations. The regular bursts have an exponential rise with a time scale of about 7 to 10 s and a sharp linear decay in 2 to 3 s. The X-ray spectrum becomes progressively harder as the burst evolves and it is the hardest near the end of the burst decay. The profile and energetics of the bursts in this black hole candidate source are distinct from both the type I and type II X-ray bursts observed in neutron star sources. We propose that the sharp decay in the observed burst pattern is a signature of the disappearance of matter through the black hole horizon. The regular pattern of the bursts can be produced by material influx into the inner disk due to oscillations in a shock front far away from the compact object.
Although no historical information exists about the Indus civilization (flourished ca. 2600-1900 B.C.), archaeologists have uncovered about 3,800 short samples of a script that was used throughout the civilization. The script remains undeciphered, despite a large number of attempts and claimed decipherments over the past 80 years. Here, we propose the use of probabilistic models to analyze the structure of the Indus script. The goal is to reveal, through probabilistic analysis, syntactic patterns that could point the way to eventual decipherment. We illustrate the approach using a simple Markov chain model to capture sequential dependencies between signs in the Indus script. The trained model allows new sample texts to be generated, revealing recurring patterns of signs that could potentially form functional subunits of a possible underlying language. The model also provides a quantitative way of testing whether a particular string belongs to the putative language as captured by the Markov model. Application of this test to Indus seals found in Mesopotamia and other sites in West Asia reveals that the script may have been used to express different content in these regions. Finally, we show how missing, ambiguous, or unreadable signs on damaged objects can be filled in with the most likely predictions from the model. Taken together, our results indicate that the Indus script exhibits rich syntactic structure and the ability to represent diverse content, both of which are suggestive of a linguistic writing system rather than a nonlinguistic symbol system.
Keywords: ancient scripts | archaeology | linguistics | machine learning | statistical analysis
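The two uses of the Markov model described in this abstract, scoring whether a string fits the model and filling in a damaged sign, can be sketched with a first-order (bigram) chain. Everything below is an assumption made for illustration: the class name `BigramModel`, the add-alpha smoothing, the boundary markers, and the toy sign names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"  # hypothetical text-boundary markers


class BigramModel:
    """First-order Markov chain over signs with add-alpha smoothing."""

    def __init__(self, texts, alpha=1.0):
        self.alpha = alpha
        self.counts = {}      # current sign -> Counter of next signs
        self.vocab = {END}    # every symbol that can appear as "next"
        for text in texts:
            self.vocab.update(text)
            padded = [START, *text, END]
            for a, b in zip(padded, padded[1:]):
                self.counts.setdefault(a, Counter())[b] += 1

    def prob(self, a, b):
        """Smoothed P(next = b | current = a)."""
        row = self.counts.get(a, Counter())
        denom = sum(row.values()) + self.alpha * len(self.vocab)
        return (row[b] + self.alpha) / denom

    def log_likelihood(self, text):
        """Log-probability of a whole text, including its boundaries."""
        padded = [START, *text, END]
        return sum(math.log(self.prob(a, b)) for a, b in zip(padded, padded[1:]))

    def fill_missing(self, text, index):
        """Return the sign that maximizes the likelihood at a damaged position."""
        def score(sign):
            trial = list(text)
            trial[index] = sign
            return self.log_likelihood(trial)
        return max(self.vocab - {END}, key=score)


# Toy corpus with made-up sign names, for illustration only.
corpus = [
    ["fish", "jar", "arrow"],
    ["fish", "jar", "comb"],
    ["man", "jar", "arrow"],
    ["fish", "jar", "arrow"],
]
model = BigramModel(corpus)

typical = model.log_likelihood(["fish", "jar", "arrow"])
atypical = model.log_likelihood(["arrow", "jar", "fish"])
restored = model.fill_missing(["fish", None, "arrow"], 1)
```

In this toy setting the reversed sequence receives a much lower log-likelihood than the attested order, which mirrors the kind of test the abstract describes for seals found outside the Indus region. The damaged middle position is restored as the sign that best fits both of its neighbors.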
The Indus script is one of the major undeciphered scripts of the ancient world. The small size of the corpus, the absence of bilingual texts, and the lack of definite knowledge of the underlying language have frustrated efforts at decipherment since the discovery of the remains of the Indus civilization. Building on previous statistical approaches, we apply the tools of statistical language processing, specifically n-gram Markov chains, to analyze the syntax of the Indus script. We find that unigrams follow a Zipf-Mandelbrot distribution. Text beginner and ender distributions are unequal, providing internal evidence for syntax. We see clear evidence of strong bigram correlations and extract significant pairs and triplets using a log-likelihood measure of association. Highly frequent pairs and triplets are not always highly significant. The model performance is evaluated using information-theoretic measures and cross-validation. The model can restore doubtfully read texts with an accuracy of about 75%. We find that a quadrigram Markov chain saturates information-theoretic measures against a held-out corpus. Our work forms the basis for the development of a stochastic grammar which may be used to explore the syntax of the Indus script in greater detail.
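The remark that frequent pairs are not always significant can be illustrated with a log-likelihood ratio statistic computed on a 2x2 contingency table for each sign pair. The abstract does not state the exact formula used, so the G² statistic below is an assumed, commonly used choice, and the function names `g2` and `bigram_g2` and the toy texts are hypothetical.

```python
import math
from collections import Counter


def g2(k11, k12, k21, k22):
    """Log-likelihood ratio G^2 = 2 * sum(O * ln(O / E)) for a 2x2 table.

    k11: count of (a, b); k12: (a, not b); k21: (not a, b); k22: (not a, not b).
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing
                expected = rows[i] * cols[j] / total
                stat += o * math.log(o / expected)
    return 2.0 * stat


def bigram_g2(texts):
    """Score every adjacent sign pair in the corpus by its G^2 statistic."""
    pairs = Counter()
    for text in texts:
        pairs.update(zip(text, text[1:]))
    total = sum(pairs.values())

    left, right = Counter(), Counter()
    for (a, b), n in pairs.items():
        left[a] += n    # how often a occurs as the first member of a pair
        right[b] += n   # how often b occurs as the second member of a pair

    scores = {}
    for (a, b), k11 in pairs.items():
        k12 = left[a] - k11
        k21 = right[b] - k11
        k22 = total - k11 - k12 - k21
        scores[(a, b)] = g2(k11, k12, k21, k22)
    return scores


# Toy corpus (hypothetical signs): two pairs that always occur together.
toy = [["a", "b"]] * 10 + [["c", "d"]] * 10
toy_scores = bigram_g2(toy)
```

A pair can be frequent simply because both of its signs are frequent. The statistic is large only when the observed count departs from what independent placement would predict, and it does not by itself distinguish attraction from avoidance.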