The ancestry of the Chinese language
Edited by William S-Y. Wang 王士元
William S-Y. Wang 王士元
These pages result from a two-day Symposium held at the City University of Hong Kong in July 1994.
The word “Chinese” in the title of this volume refers to the group of dialects which project back at least 3500 years, for even the earliest texts inscribed on oracle bones and bronze vessels show a relation to modern Chinese lexically, syntactically, and orthographically. On the other hand, diversification into the extant modern dialects probably took place only after China was unified under a central authority and southward migrations began on a massive scale. The terms “Chinese”, “Sinitic” and “Sino-” are therefore quite appropriate, since they all derive from the name of the dynasty, the Qin, from which time these events took place.
The word “ancestry”, however, is not as clear-cut. Obviously part of its meaning has to do with determining which languages are related to Chinese. However, if we accept the monogenetic view in linguistics, then all of the 6000 or so languages of the world are related to each other. Our enterprise becomes interesting only if we refine our goal by requiring that the languages be grouped into monophyletic units.
A group of languages is monophyletic if and only if all its members are maximally related within the group, and no language outside the group has this property. Such groups of languages can be grouped into higher-level monophyletic units, which become increasingly inclusive as we go up in the hierarchy. In this sense, our goal in studying the ancestry of Chinese is reached only when we reach the highest level of such a hierarchy that we can justify.
The familiar tree diagrams which depict these hierarchies, constructed on the basis of successive subgroups, go back well over a hundred years, to when Charles Darwin used them for biological species and August Schleicher used them for languages. However, very shortly after the use of trees was introduced in linguistics, it was pointed out that languages typically behave in a way which species typically do not: languages imitate each other when they come into contact. Thus, the wave theory suggests that languages which are closer geographically also tend to be more alike.
Genetic material comes only from parents. A language, to be sure, also has material inherited from earlier states of the same language; but it also has material introduced via imitations of other languages. The similarity we observe between languages typically derives from both sources. To know the true ancestry of languages, we face the problem of how to sort out one form of similarity from the other: the inherited, which comes from within, from the imitated, which comes from without. Only then can we begin to quantify degrees of relatedness and justify monophyletic units. The problem is an extremely difficult one since it appears that anything in one language can be imitated by another language, though not all with the same level of facility.
Language study in China has a rich tradition, which reaches back some 2000 years, with brilliant achievements to its credit. However, the mainstream of this scholarship largely remained focused on the texts of the ruling groups, with relatively little attention paid to the languages of the neighboring minorities. I have looked in vain in the early Chinese literature for discussions on the ancestry of the Chinese language.
Early scholarly writings in Europe on the Chinese language have been reviewed by Watters [1889]. Thus, we find observations on the Chinese writing system as early as Francis Bacon [1561-1626]. There were a variety of theories on the ancestry of Chinese, some more fanciful than others. One theory held that the language was “invented all at once by some clever man to establish oral intercourse among the many different nations who inhabited that great country which we call China.” According to Watters [p.4], this brand of special creationism was accepted by no less a figure than Leibnitz [1646-1716].
Opposed to this view of Chinese being man-made, there were also theories aimed at relating the language to a biblical scenario. An influential essay by Webb, published in 1669, argued that Chinese was the first language, spoken in the Garden of Eden. Chinese has also been variously identified as the language of Noah, as well as with each of his sons: Ham, Shem and Japhet. Such fanciful proposals become more understandable if we recall that it was around this time that James Ussher [1581-1656], Archbishop of Armagh, was calculating when the world was created by adding up the genealogies given in the Bible.
The dominant views of the 19th century, when evolutionary thinking was already in the air, typically held that the Chinese came from the west, from Mesopotamia, though some preferred Egypt as a source. That the ancestry of Chinese should be traced to Babylonia was put in no uncertain terms by Lacouperie, who was then Professor of Indo-Chinese Philology at University College in London, and president of the Royal Philological Society:
“China has received it[s] language (since altered) … from the colonies of the Ugro-Altaic Bak families who came from Western Asia, … which emanated from Babylonia and was modified in its second focus. This general statement is now beyond any possibility of doubt, for the evidence in its favor is overwhelming.” [Quoted by Watters, 1889:11]
Unfortunately, a century and more later and in spite of Professor Lacouperie’s emphatic assurance, we are still very much in doubt as to how to trace the ancestry of Chinese. The Ural-Altaic hypothesis has never garnered much support. But there was no dearth of alternative hypotheses to choose from, some offered by well-known European scholars. None of these hypotheses can be taken seriously now.
These early investigations had no access to many of the advances and increased sophistication in linguistic methodology that have taken place in recent decades. Neither could they have foreseen that archeology was to reveal many Neolithic sites in China which date thousands of years before Bishop Ussher’s Garden of Eden. Perhaps Professor Lacouperie can be excused for his touch of intellectual hubris.
Even more importantly, with these advances there has accumulated a much richer body of data from neighboring languages to compare with Chinese, especially with older stages of reconstructed Chinese. The question is no longer whether Chinese shares some similarities with language X or language Y, as it was seen during the last century.
Rather, we need to know which of these similarities are due to inheritance and which are due to imitation. The criteria for sorting these similarities one from the other, as I noted earlier, are not yet well understood and far from uniformly accepted. And if indeed the similarities are due to inheritance, the next question is whether Chinese forms a monophyletic unit with X, with Y, or with both [if our tree allows nonbinary branching]. Questions of degree arise here, and Baxter’s paper, which begins Part I of this volume, is an explication of the probabilistic reasoning that must underlie hypotheses of subgrouping.
Indeed, anyone who thinks that we have answers to these questions at present, which are “beyond any possibility of doubt”, will not have read this volume carefully. In Gong’s paper, we have further verification of the Sino-Tibetan hypothesis, with strong support from Tangut evidence that has not been incorporated before. The narrow version of this hypothesis which Gong considers here, including just the Chinese dialects and the Tibeto-Burman languages, is the closest we can come to a consensus at present; but see the comments by Sagart in Part II of this volume.
Attempts to posit higher monophyletic units for Chinese, however, do not as yet command nearly the same degree of consensus. These include the similarities to Indo-European, observed by Pulleyblank, the connections to the North Caucasian and Yeniseian languages, posited by Starostin, and the relation to Austronesian, proposed by Sagart.
These hypotheses are debated by Blust, Li, Pulleyblank, Starosta, and Starostin, mostly in Part II of this volume. This section also includes some remarks from Meacham, who provides a useful archeological perspective.
The papers by Pan, You and Zhengzhang all endorse a wider circle of genetically related languages; besides Chinese and Tibeto-Burman, they would include Miao-Yao, Kam-Tai, Austronesian, and perhaps some other yet unaffiliated languages. They explore different methods to arrive at their groupings, from word families, to animal names, to basic lexicon respectively.
Pan and Zhengzhang use the term “Sino-Austric” or “Hua’ao” to designate this far-ranging phylum they posit. If their hypothesis is correct, however, it is doubtful that Chinese would rank high enough in the tree diagram to warrant being included in the name of the root node. Chinese probably split off from the rest of the tree and came to prominence quite late, several millennia from the root. Its success story is not unlike the great spread of English in Indo-European or Bantu in Niger-Kordofanian.
Which of the connections advocated in the following pages will stand the test of future research, and how do they fit into a hierarchy of linguistic groups? Though it is useful to explore these hypotheses, we are far from any definitive answers at present. Yet answers to these linguistic questions are necessary if we are to proceed to the larger interdisciplinary topics of dating the various stages of linguistic development, placing the various prehistoric communities on a map, and eventually being able to say something about the culture, the society, and perhaps the mentality of peoples who passed away six, seven, or even eight thousand years ago.
This last topic is one where linguistic reconstruction is particularly well situated to make primary contributions, since the vocabulary an ancient people used in their daily lives can offer a fuller and more enlightening view than just their material remains unearthed by the archeologist. In discussing the reconstruction of the Proto-Indo-European words for their gods, Watkins wrote that, “The reconstructed words *deiw-os and *dyeu-peter- alone tell us more about the conceptual world of the Indo-European than a roomful of graven images.” [1985: xvii]
The point here is that the two disciplines have a great deal to offer each other. Clearly, we can achieve a much more complete picture of the past when both sets of data are taken into account, for complementation as well as for cross-validation.
Interest in dating linguistic divergence was stimulated in the 1950’s when Swadesh proposed glottochronology. Inspired by the discovery that physical objects can be dated by measuring chemical elements that have a constant decay rate, Swadesh’s original insight is the analogy that the basic lexicon in any living language is replaced at a relatively constant rate.
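The constant-rate idea can be made concrete with the formula usually given for glottochronology, t = ln c / (2 ln r), where c is the proportion of basic vocabulary two languages still share and r is the proportion retained per millennium. The sketch below is only an illustration: the function name is hypothetical, and the default r = 0.86 is a commonly cited figure for a 100-item list, assumed here rather than taken from this volume.

```python
import math

def divergence_time(shared_fraction, retention_rate=0.86):
    """Estimated millennia since two languages separated.

    shared_fraction: proportion of basic-vocabulary items still cognate (0 < c <= 1).
    retention_rate:  proportion retained per millennium in one lineage
                     (0.86 is an assumed, commonly cited value).
    """
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0 < retention_rate < 1:
        raise ValueError("retention_rate must be in (0, 1)")
    # Both lineages replace vocabulary independently, hence the factor of 2.
    return math.log(shared_fraction) / (2 * math.log(retention_rate))
```

Under these assumptions, two languages sharing 0.86 squared (about 74%) of the list would be dated to one millennium of separation. The estimate is only as good as the constancy of the rate.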
Over these decades, a great deal of progress has been made in refining the original ideas. In particular, several additional numerical methods have become available in the form of statistical software which can be used on the personal computer, as reviewed by Wang. Although these methods were initially developed for purposes of biological systematics, their usefulness to linguistics is obvious and their application straightforward.
Until recently, the family trees drawn to subgroup languages did not assign any quantitative value to the branches. The new methods allow us to let the length of each branch represent the duration of independent evolution of the monophyletic unit it dominates. Furthermore, the branches are additive in the sense that the distance between any two languages is represented as the sum of the branches along the shortest path between them. As these methods become more widely investigated in linguistics, and applied to many diverse language groups, we may find Swadesh’s original insight to be largely correct, even though his method is defective. If this turns out to be the case, as I suspect it will, then linguists and archeologists can cross-validate each other on dating as well as cultural interpretations.
Alongside archeology and linguistics, a third partner for future research on the ancestry of Chinese is genetics. The potential for this collaboration was seen by Darwin when he wrote in Chapter 14 of the Origin of Species: “If we possessed a perfect pedigree of mankind, a genealogical arrangement of the races of man would afford the best classification of the various languages now spoken throughout the world.”
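The additive property of branch lengths can be illustrated with a small sketch. The tree below is a toy: the node names and branch lengths are invented for illustration and do not correspond to any language group discussed in this volume.

```python
# Toy additive tree: each node maps to (parent, length of the branch to that parent).
# All names and lengths are hypothetical.
TOY_TREE = {
    "root": (None, 0.0),
    "X": ("root", 2.0),
    "A": ("X", 1.0),
    "B": ("X", 1.5),
    "C": ("root", 3.0),
}

def tree_distance(tree, a, b):
    """Sum of the branch lengths on the path joining nodes a and b."""
    # Record the distance from a to each of its ancestors (including itself).
    dist_from_a = {}
    node, travelled = a, 0.0
    while node is not None:
        dist_from_a[node] = travelled
        parent, length = tree[node]
        travelled += length
        node = parent
    # Climb from b until we reach a node that is also an ancestor of a.
    node, travelled = b, 0.0
    while node is not None:
        if node in dist_from_a:
            return dist_from_a[node] + travelled
        parent, length = tree[node]
        travelled += length
        node = parent
    raise ValueError("nodes do not share a root")
```

In this toy tree, A and B are joined through X, so their distance is 1.0 + 1.5, while A and C are joined only through the root.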
Much more is known now about the extinction, mixing, and replacement of languages, and these processes are certainly all abundantly attested in China.
In particular, assimilation to the language of the Hans, i.e., Sinification, has been a powerful force over the past two millennia. It is clear that all languages within the Chinese sphere of influence have assimilated to Chinese, though to varying degrees. Furthermore, it must also be the case that the Chinese dialects are rife with numerous strata of words which are imitations from neighboring languages over the millennia; this fact is unfortunately obscured by these words all being written in Chinese characters. These factors all taken together, the correlation between genes and languages cannot be nearly as straightforward as Darwin had envisioned.
Furthermore, to get the gene-language correlation to work at all, it is critical that the appropriate genetic markers be selected. The one comprehensive genetic study of China that has been reported so far [Zhao and Lee, 1989], based on Gm and Km allotypes, actually shows an opposite result. It reveals that the Hans are virtually always closer genetically to their non-Han neighbors than to other Hans further away on the map. In other words, no Han unity can be detected on the basis of these two allotypes.
Nonetheless, the gene-language correlation appears to be more consistent on a global scale [Cavalli-Sforza et al., 1994]. Here we are dealing with a much grander span of time, since the whole world is included instead of just China, and many more genetic markers were included in the study. Whatever the explanation will eventually be, it seems to me we can learn a lot about the past not by identifying linguistic history with genetic history, but by comparing the two and trying to explain the congruence as well as the incongruence.
The fundamental fact is that children in all societies typically learn their first language from their mothers, from whom they inherit half of their genes as well. And the probability is high that the father speaks the same language since one would not choose a mate one cannot communicate with. So a positive gene-language correlation should be the null hypothesis, following Darwin’s remark. The factors which reduce this correlation, including the ones mentioned above, are evolutionary events that should be studied in conjunction with the language histories.
If the best attempts at uncovering the ancestry of Chinese, as exemplified by the scholarship contained in the following pages, are seen to fall considerably short of the goal, I would like to think it is because the theory of the past we wish to build is still lacking in crucial data. If nothing else, the Symposium has been invaluable in calling forth a variety of new linguistic data to be examined, digested, and integrated into a synthetic framework.
But my perception is that we also need ideas and data from the other areas that are concerned with similar questions. I have mentioned archeology and genetics above; others also readily come to mind. Physical anthropology can examine the fossil remains and tell us something about the characteristics of the speakers themselves. Comparative ethnography can help us group peoples on the basis of their customs, myths, and beliefs.
We need to be cautious, of course, in drawing inferences across bodies of knowledge – people get displaced and languages get replaced. It would be foolhardy, as an obvious example, to believe that the Liangzhu culture situated in Jiangsu and Zhejiang some 4000 years ago has any direct links with the Wu speakers there today. Nonetheless, the ideas and data from each body of knowledge, judiciously interpreted, can provide a unique window on the past. And the combined view from all these windows allows us to complement and cross-validate our perspective, so that ultimately our knowledge of the ancestry of Chinese can be reconstructed on a broad and secure foundation.
Lastly, it remains to acknowledge the various contributions which made this volume possible. The Symposium was funded in part by a grant from the Chiang Ching-Kuo Foundation for International Scholarly Exchange. Benjamin T’sou was most gracious and effective in providing logistic support from the host institution at which the symposium was held.
I thank the twelve authors for sharing their expert knowledge with us in this volume. Most regrettably, You Rujie of Fudan University and Zhengzhang Shangfang of the Institute of Linguistics in Beijing were unable to join the Symposium, due to circumstances beyond their control; their papers were transmitted through the courtesy of Pan Wuyun of the Shanghai Normal University. No doubt the Symposium would have been richer still had they been able to come.
Weera Ostapirat had the responsibility of turning the Symposium papers into a unified volume. The task of reconciling the numerous differences in fonts, scripts, and spelling conventions is a daunting one. Our guideline is to get the monograph published as accurately and as quickly as possible, while the memory of the Symposium is still fresh, even at the cost of a rougher appearance. Thanks to Weera’s tireless dedication, and to the assistance of Shen Rongqiu of JCL and of Shi Feng of Nankai University, this monograph has become a reality.
My hope is that the ideas and the data presented in these pages will do much toward defining the state of the art for research on this question, and toward stimulating and guiding future work. Clearly, the ancestry of Chinese is a question of central importance for any theory of linguistic evolution, for human prehistory in general, and for the genesis of Chinese civilization in particular. Perhaps the next Symposium on this question will broaden the base of discussion by including other disciplines as well.
“A stronger affinity … than could have been produced by accident”: A probabilistic comparison of Old Chinese and Tibeto-Burman
William H. Baxter 白一平
In recent years there has been increasing interest in the possibility of tracing distant language relationships. The discussion has even reached the popular press, which usually pays little attention to linguistics. One attraction of hypotheses about language relationships is that they might tell us something about human population movements in prehistoric times – something we otherwise have relatively little evidence about. Some of the bolder recent proposals for distant linguistic relationships include the putative families Nostratic (Illič-Svityč 1971-1984, Dolgopolsky 1964) and Amerind (Greenberg 1987). As for proposals involving Chinese, the association of Chinese and Tibeto-Burman in a Sino-Tibetan family, though widely accepted, is not uncontested; the position of Thai and related languages is still debated (traditionally part of Sino-Tibetan, but assigned by Paul Benedict to Austro-Thai, along with Austronesian). Sergei Starostin (1982, 1984, 1991b) has argued for a Sino-Caucasian family including Sino-Tibetan, Yeniseian, and North Caucasian, and his colleague Nikolaev (1989) has further proposed a Dene-Caucasian family including Starostin’s Sino-Caucasian and the Na-Dene family of North America. (This proposal ties in with Greenberg’s Amerind hypothesis, according to which Na-Dene and Eskimo-Aleut represent more recent incursions from Asia into the Americas, compared with the older Amerind family incorporating all other Native American languages.) More recently, Laurent Sagart (1993) has given arguments for a genetic relationship between Chinese and the Austronesian family.
The discussion of these hypotheses among linguists has been understandably lively, but at times also surprisingly acrimonious: proposals which some linguists regard as established scientific fact are dismissed by others as irresponsible nonsense. The source of the problem, I believe, is the lack of consensus among linguists about how hypotheses like these can be evaluated objectively. In the absence of such consensus, the attitudes of individual linguists towards a controversial hypothesis often reflect more about their individual temperaments, and the habits of their respective academic microcultures, than about the evidence for or against any particular hypothesis.
This paper attempts to remedy this situation by proposing objective methods for evaluating hypotheses about remote linguistic relationships, and to illustrate these methods by applying them to the hypothesis that Chinese and the Tibeto-Burman languages are genetically related. The fundamental assumption of this approach is that when two languages show phonological correspondences in their lexicons which are too great to attribute to chance, this fact calls for some explanation. The only plausible explanations other than a genetic relationship are (1) lexical borrowing and (2) a non-arbitrary relationship between sound and meaning. If these two explanations cannot account for the correspondences, then a genetic relationship is the only plausible explanation left. This may be the case even if no ancestral proto-language has actually been reconstructed, and if the phonological histories of the languages are still poorly understood.
The use of greater-than-chance correspondences as a criterion for genetic relatedness is at least as old as Sir William Jones’s famous observation in 1786 that Sanskrit, Greek, and Latin show “a stronger affinity, both in the roots of verbs and in the forms of grammar, than could have been produced by accident; so strong that no philologer could examine Sanskrit, Greek, and Latin, without believing them to have sprung from some common source, which perhaps, no longer exists.” (Quoted in Ruhlen 1994:12; emphasis added)
If this approach is to move beyond relying on the intuition of the ‘philologer’, it is crucial that one be able to decide on principled grounds how likely it really is that the correspondences observed are the result of chance. The use of probability theory to test possible genetic relationship is by no means new, but this approach is still not widely understood or used. The approach described here is essentially the same as that of Justeson and Stephens (1980). Possibly novel aspects of this study include (1) the emphasis on using a fully explicit algorithm, which can be implemented on a computer, to identify phonological matches; (2) using a computer to actually simulate repeated random trials; and (3) applying the method to Chinese and Tibeto-Burman.
The approach used here can be summarized as follows:
1. Parallel word lists are independently assembled for the languages being compared: in this study I use a 35-item list (of which I exclude two) compiled by S. E. Jaxontov; from this list are constructed parallel word lists for Old Chinese and Tibeto-Burman. These lists are specified and discussed more fully below.
2. An algorithm is constructed which will decide whether any particular pair of words will be counted as a phonological match. (An algorithm is an explicit procedure which can be applied mechanically by a computer, and which always gives an answer.)
3. This algorithm is used to count how many of the items on the list match when they are paired according to their meaning. We may call this the observed score; it will be compared with the scores obtained when items are paired at random.
4. With the aid of a computer, a large number of random trials are performed in which one of the lists is mechanically scrambled and then matched against the other list, using the same algorithm. The computer counts and remembers how many phonological matches are obtained on each trial. The proportion of random trials whose scores are as high as the observed score is an estimate of the probability that the observed score could have occurred by chance. If this probability is below a certain level, then the phonological matches observed are judged too numerous to ‘have been produced by accident’. In some cases it is also relatively easy to estimate this probability by a mathematical formula.
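The four steps above can be sketched as a short permutation test. This is a minimal illustration under stated assumptions, not the algorithm used in the paper: the matching criterion (identical first letter) is a deliberately crude placeholder, the function names are hypothetical, and any word lists passed in would have to be supplied by the user.

```python
import random

def is_match(word_a, word_b):
    """Placeholder matching criterion (an assumption): same first letter."""
    return word_a[0] == word_b[0]

def score(list_a, list_b, match=is_match):
    """Step 3: count matches when items are paired position by position."""
    return sum(1 for a, b in zip(list_a, list_b) if match(a, b))

def permutation_test(list_a, list_b, trials=10000, seed=0, match=is_match):
    """Step 4: estimate how often a scrambled pairing scores as high as the real one.

    Returns (observed score, proportion of random trials scoring >= observed).
    """
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = score(list_a, list_b, match)
    scrambled = list(list_b)           # copy, so the caller's list is untouched
    hits = 0
    for _ in range(trials):
        rng.shuffle(scrambled)         # break the link between form and meaning
        if score(list_a, scrambled, match) >= observed:
            hits += 1
    return observed, hits / trials
```

A call such as permutation_test(list_a, list_b) returns the observed score together with the estimated chance probability. Because the same match function is used for the real and the scrambled pairings, a looser criterion raises both scores together, which is why only the probability is comparable across tests.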
It should be kept in mind that the observed score has no significance in itself; it will tend to be higher or lower depending on the strictness of the criteria for a phonological match. Seven matches in a test with very strict criteria may be strong evidence of a relationship, while fifteen matches on another test may not be significant at all. The observed score is like the raw score on a standardized test: observed scores from different tests are standardized by reference to the probability in each particular test of getting so high a score by chance.
Section 2 describes in some detail two experiments I conducted in this fashion on word lists from Old Chinese and reconstructed Tibeto-Burman (as presented in Benedict 1972). Section 3 gives some further comments and cautions about how to design such experiments; section 4 discusses the interpretation of results.
2. THE EXPERIMENT
2.1 Word lists
2.2 An algorithm for Old Chinese-Tibeto-Burman correspondences
3. HOW TO CHEAT
3.1 Stacking the deck
3.2 Inflating the score
3.3 But I didn’t cheat much…
4. INTERPRETING THE RESULTS
4.1 Interpreting a positive result: getting lucky
4.2 Interpreting a negative result
The purpose of this paper is to set up the system of finals in Proto-Sino-Tibetan (PST) on the basis of a comparison of four classical languages in this family, i.e. Old Chinese (OC) as reconstructed by Li (1971), Written Tibetan (WT), Written Burmese (WB), and Tangut as reconstructed in Gong (1993b). In this paper I have reviewed the cognates proposed in my earlier study (Gong 1980), eliminated those which have turned out to be untenable, and incorporated some new ones proposed by myself (Gong 1990, 1991) as well as by other scholars in the field, especially Bodman (1980, 1985), Luce (1981), Coblin (1986), Starostin (1989), and Yu (1989). I have also included in this paper my own studies of ST cognates in Tangut and my recent findings of some new ST cognates.
Tangut is an extinct ST language with literature dating from the 12th century. It is very important for the reconstruction of Proto-Tibeto-Burman (PTB) as well as of PST phonology, because it retains the medial /-j-/ of PST which is supposed to have been lost in WT (in all environments except before the high front vowel /-i-/) and WB. Tangut rhyme tables contain 105 rhymes and, for this reason, Tangut has been regarded as having a very complex vowel system. This presents problems for my previous hypothesis that PST had a simple vowel system of four vowels like that of OC. However, after a series of studies on the phenomena of vowel alternations in Tangut (Gong 1988, 1989, 1993a, 1994a and 1994c), I have managed to establish a simple system of seven vowels to account for the 105 rhymes in the Tangut rhyme books. In the present work I will try to outline how this seven vowel system developed out of the PST four vowel system.
3.1 The development of vowels from PST to WT
3.1.1 The sources of WT /o/
3.1.2 The sources of WT /e/
3.1.3 The sources of WT /u/
3.2 The development of vowels from PST to WB
3.3 The development of vowels from PST to Tangut
4. FINAL CONSONANTS
4.1 Homorganic consonant alternations in OC and PST
4.2 Internal reconstruction of lost finals in OC
4.3 Dental final consonants
4.4 Velar final consonants
4.5 Labial final consonants
5. CONCLUDING REMARKS
All human languages may ultimately be genetically related if we believe the generally accepted theory that mankind originated in or near Ethiopia, Africa about three million years ago. Anthropologists tend to believe that human language evolved long after the first human, “the son of Lucy,” came into being. “He has a protruding jaw, thick brow ridges and a braincase so small it leaves no doubt that our ancestors learned to walk long before they mastered complex thought” (Begley 1994). Human language may have evolved hundreds of thousands of years ago, but linguists are only successful in reconstructing the history of a language over shorter spans of a few thousand years.
Any pair of languages in the world can be demonstrated to be related in one way or another. It is a matter of how strictly we adhere to the rigorous comparative method, as generally adopted by historical linguists. As Yuen-ren Chao commented on a master’s thesis (Wang 1927) in regard to genetic relationship between languages, “It is easy to say there is one, but hard to say there isn’t.”
2. ON THE AUSTRO-CHINESE HYPOTHESIS
3. A COMPARISON
3.2 Natural Phenomena and Objects
3.3 Body Parts
3.6 Kinship and Personal Relations
3.10 Cultural Items
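In recent years, a number of scholars have discussed the relationship between the languages of the traditional Sino-Tibetan family and Austronesian. Benedict (1944), for example, proposed that Tai and Austronesian form a new family, Austro-Tai, while Sagart (1993) pointed to a genetic relationship between Chinese and Austronesian which does not include Kam-Tai. Excluding Kam-Tai from traditional Sino-Tibetan has become the almost unanimous view of Western scholars in recent years. There are two main reasons for this. One is that the cognate pairs have been badly chosen: the Kam-Tai word for ‘bird’ should be compared with Chinese 骛, and if it is compared with Chinese 鸟 ‘bird’ the conclusion will of course be that the words are not cognate. The other is that the proto-forms of Chinese and Kam-Tai have been wrongly reconstructed: once we know that the proto-form of 翼 ‘wing’ is b lmk, Dai pik ‘wing’ is naturally unproblematic as a cognate of the Chinese word. Once errors of this kind are eliminated, the basic vocabulary that Kam-Tai shares with Chinese as cognates turns out to be quite considerable.
The genetic relationship between Kam-Tai and Chinese is discussed by the author in a separate paper. The present paper, taking the cognacy of Kam-Tai and Chinese as its premise, only examines the significance of the new Western theories about the relationships among the languages of Southeast Asia. If the cognate relationship between Kam-Tai and Chinese is affirmed and Benedict’s Austro-Tai hypothesis is also accepted, it necessarily follows that Austronesian and Chinese are cognate. Hardly any scholars have mentioned the cognate relationship of Austroasiatic; if one further accepts that Kam-Tai and Chinese are cognate, the conclusion naturally follows that Chinese and Austroasiatic are cognate as well. In the study of the genetic relationships among the languages of Southeast Asia, therefore, Kam-Tai serves almost as a bridge. It is on the basis of this understanding that Zhengzhang Shangfang holds that Chinese, Tibeto-Burman, Kam-Tai, Miao-Yao, Austroasiatic, and Austronesian form one large phylum, Sino-Austric (华澳语系). This paper supports the Sino-Austric hypothesis.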
i(脂) w(之) u(幽)
e(支) a(鱼) o(侯)
1. The Principle of Regularity in Sound Change
2. Diachronic Study of Language as a Window on Prehistory
3. Language Contact – Another Window into Unrecorded History and Prehistory
II. Reconstruction of Old Chinese – The Essential Foundation
1. Karlgren’s yod
2. Type A and Type B Syllables
3. The Old Chinese Vowel System
4. /ə~a/ Ablaut as a Morphological Feature in OC
5. Initial Consonants
6. OC Sources of Middle Chinese Initial j-.
III. Chinese (Sino-Tibetan) and Indo-European
1. Previous attempts to show a relationship between Indo-European and Chinese or Indo-European and Sino-Tibetan
2. Archaeological Evidence for the Possibility of Early Contacts between Proto-Sino-Tibetans and Proto-Indo-Europeans
3. Proto-Indo-European and Proto-Sino-Tibetan Phonology and Morphology
IV. Selected Cognates
1. IN. INSIDE
2. WHAT? WHO?
3. JOIN. FIT; CLOSE; COVER; SEIZE, HOLD. HAVE
4. EYE, FACE
6. COW, CATTLE
7. GO, COME
9. COUNTRY TERRITORY
13. SMALL, YOUNG
20. SIT, SET, WEST, NEST
22. NAME, MOON, MONTH
1. CHINESE AND TIBETO-BURMAN (TB).
1.1. Some cons:
1.1.1. The sound correspondences
1.1.2. The borrowing issue
1.1.3. Correspondences of the borrowed layer
1.2. Some Pros
2. NON-LINGUISTIC ASPECTS
Sergei [Sergai] A. Starostin 斯塔罗斯金
In this paper I would like to discuss a rather important methodological problem: does historical linguistics possess an objective procedure for evaluating proposed hypotheses concerning the genetic relationship of languages?
The procedure that I propose below is the following:
a) to prove that two (or more) languages or linguistic families are related, we must know the set of regular phonetic correspondences connecting those languages. Otherwise any discussion is futile (all proposed equations may be due to chance). This is the standard demand of comparative linguistics.
b) the languages (or linguistic families) compared should share a significant part of basic vocabulary, and the items compared should match each other according to the set of correspondences demonstrated in step a). This is also a common demand, but it is usually much less clear than the first one. What is basic vocabulary? What part of it is significant? I venture to propose here a test that appears (at least in my experience) to work in all cases of established genetic relationship.
As a rather quick way to test the results of comparison we may take the list of the 35 most stable meanings proposed by S.Y. Yakhontov. They include the following (in English alphabetical order): ‘blood, bone, die, dog, ear, egg, eye, fire, fish, full, give, hand, horn, I, know, louse, moon, name, new, nose, one, salt, stone, sun, tail, this, thou, tongue, tooth, two, water, what, who, wind, year’. Actually, the stability of some items in Yakhontov’s list raises doubts (this concerns, e.g., the items ‘one’ and ‘this’). We could easily choose some other list, but this one has the advantage of having already been tested on a great many linguistic families of the world. The compared items should match completely in meaning (i.e., correlations like ‘fire’ : ‘hot’ or ‘water’ : ‘flow’ are not taken into account, so as to exclude discussion of the semantic plausibility of a comparison).
I maintain that in all known cases of established genetic relationship this test yields the following results:
a) closely related languages (like Slavic or Germanic) have about 30 or more related items within the 35-word list.
b) more distantly related languages (on the level of Indo-European) have more than 15 related items within the 35-word list. To establish the precise nature of the relationship (in order to distinguish, e.g., the Balto-Slavic level from the Indo-European level) we have to resort to other, more precise, statistical methods.
c) if the compared languages have from 5 to 15 related items within the 35-word list, we can suppose a still more distant relationship between them. The precise nature of the relationship is difficult to establish (it may be very archaic, like Nostratic, or somewhat closer, like Uralic or Altaic; other statistical methods should be used to obtain more precise results in such cases).
d) if the languages compared have fewer than 5 common items in the 35-word list, it means either that they are not related at all (and the existing common items must be explained by pure chance or by borrowing), or that the common words may in fact be the ‘Proto-World’ heritage, if one believes in monogenesis. We will not discuss the latter hypothesis here: obviously, if one proposes a theory genetically relating two languages, one implies that they are more closely related to each other than to all other languages of mankind.
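The four bands a) to d) can be sketched as a small lookup, offered here only as an illustration and not as part of Starostin's paper. The function names `count_matches` and `classify` and the band labels are hypothetical. The sketch also assumes that "about 30 or more" can be read as a hard cut at 30, which the text itself leaves approximate, and that cognacy judgments for each meaning have already been made by the linguist according to step a).

```python
# Yakhontov's 35 meanings as listed in the text above.
YAKHONTOV_35 = (
    "blood", "bone", "die", "dog", "ear", "egg", "eye", "fire", "fish",
    "full", "give", "hand", "horn", "I", "know", "louse", "moon", "name",
    "new", "nose", "one", "salt", "stone", "sun", "tail", "this", "thou",
    "tongue", "tooth", "two", "water", "what", "who", "wind", "year",
)


def count_matches(judgments):
    """Count the meanings judged related.

    `judgments` maps a meaning to True/False; meanings left out are
    treated as unrelated. The judgments themselves are the linguist's
    (regular correspondences, exact semantic match), not computed here.
    """
    unknown = set(judgments) - set(YAKHONTOV_35)
    if unknown:
        raise ValueError(f"not on the 35-word list: {sorted(unknown)}")
    return sum(1 for meaning in YAKHONTOV_35 if judgments.get(meaning, False))


def classify(n_matches):
    """Map a match count to one of the bands a) to d).

    Assumption: 'about 30 or more' is treated as n >= 30.
    """
    if not 0 <= n_matches <= len(YAKHONTOV_35):
        raise ValueError("match count must lie between 0 and 35")
    if n_matches >= 30:
        return "a) close relationship"
    if n_matches > 15:
        return "b) distant relationship"
    if n_matches >= 5:
        return "c) still more distant relationship"
    return "d) unrelated, chance or borrowing"
```

For example, `classify(count_matches({"blood": True, "dog": True, "ear": True}))` falls in band d), since only three meanings are judged related. Counts near a boundary (say 29 or 30) should be read with the same caution as the word "about" in the text.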
Rujie You 游汝杰
Shangfang Zhengzhang 郑张尚芳
0. THE POINT OF DEPARTURE.
I have been asked to comment briefly on the Sino-Austronesian (SAN) hypothesis of Laurent Sagart. The units of comparison with which Sagart works are Old Chinese (OC), “circa 800-500 BC, a language closely related to, if not directly descended from, the language of the Shang inscription” (Sagart 1993:3), and various reconstructed stages of Austronesian (AN). In Sagart (1993) the latter are called “reconstructed Austronesian” (RAN), a cover term for proto-languages ranging from Proto-Austronesian (PAN, circa 4000 BC) to Proto-Malayo-Polynesian (PMP, circa 3500 BC) to Proto-Western Malayo-Polynesian (PWMP, circa 3000 BC) (Blust 1984/85). In order to reduce the probability that the proposed etymologies are products of chance, Sagart (1994) restricts his comparison to OC and PAN. Since I am evaluating the evidence for the SAN hypothesis as a whole, my comments will necessarily range across both publications, without always distinguishing between them.
Perhaps the first remark worth making is that Sagart derives Chinese from PAN or various lower-order AN proto-languages (PMP, PWMP, etc.). The effect of this procedure is to disarm possible criticism of his startling hypothesis from the AN side: so long as the literature is cited accurately – and in this respect Sagart cannot be faulted – there can be no quibbling with his reconstructions from the very Austronesianists who have proposed them. At the same time one must not lose sight of the historical implications of such a comparative procedure. If all essential features of Old Chinese (OC), including consonant series, vowels, tone, and some lexical alternations suggestive of earlier morphology can be derived from unmodified forms of existing AN reconstructions, we are forced to the even more startling conclusion that Chinese is not a sister branch of AN, but rather a geographically displaced and typologically aberrant member of the AN language family itself.
Because Sagart essentially equates SAN and AN, and treats the reconstructed AN material in a responsible and competent manner, the burden of falsifying the SAN hypothesis must fall most heavily on the shoulders of those who are professionally qualified (as I am not) to evaluate his treatment of the Chinese material. For this reason I will limit myself to commenting briefly on some general features of the SAN hypothesis which I believe are worthy of critical consideration.
1. SINO-AUSTRONESIAN MORPHOLOGY
4. THE SINO-AUSTRONESIAN LEXICON.
5. THE PROBLEM OF COMPETING HYPOTHESES
6. THE HOMELAND PROBLEM
7. THE EVIDENCE OF PHYSICAL ANTHROPOLOGY
Ever since Gordon Childe, prehistorians have rightly focused on the rise of agriculture as the principal transformation of human subsistence patterns and the most important single event in prehistory. However, many have questioned the nature of this event, particularly whether it should be viewed as the “Neolithic Revolution” Childe had written of or whether it should be more properly described as a process. In a paper at the multi-disciplinary conference on “The Origins of Chinese Civilization” at Berkeley, California in 1978, the respected botanist Huilin Li (1983:21) wrote:
The idea of a Neolithic Revolution, implying a sudden and dramatic change in human history, is misleading. Evidence has been accumulated to show that the transition from food gathering to food producing was very gradual…A single point of origin, a zone including Anatolia, Iran and Syria was once believed to have given rise to plant domestication…From there, agriculture was supposed to have spread to other parts of the world. Now, however, independent origins in many different parts of the world are considered probable.
The issues of suddenness or gradualism, of one or many centers, and of why and how agriculture spread, are still of course open and subject to much debate. As will become evident below, the view one takes on these issues will to some extent determine the receptivity that one might have to the grand linguistic scenarios currently being generated for prehistory.
My own view is one of concurrence with Li that the process was gradual and that it involved many different independent origins and small incremental advances, at least in the first stages. Further, it seems likely to me that although the cultivation of plants was certainly a highly attractive proposition in the incipient phase, the rise of agriculture was not an immediate “success story” but a long and frequently arduous struggle to find and maintain a system that would reliably produce the means for survival and population growth in each ecological, topographic and climatological niche where plant cultivation was attempted. Perhaps it was an irony of nature, or a trick played upon us by the earth gods, but hunting-gathering by small bands was probably more reliable and required less labour than food-production (Sahlins 1974). But once the path to food production had been well and truly taken it was increasingly difficult to give it up. A more sedentary lifestyle and more mouths to feed were undoubtedly among the factors that led to advances in agricultural technology, which in turn led to population increase and further specialization within the econiche, etc. The process of agricultural development must often have been precarious; in the Early Neolithic, and at the continuing and sometimes amorphous interface between animal husbandry/plant cultivation and hunting-gathering, there must have been many faltering steps and failures.
Some of the processes of agricultural development and especially the diffusion of agriculture to adjacent peoples will be briefly examined below for the East Asian context. These issues are crucial to the various hypotheses of early language spread and replacement linked to agricultural dispersal.
THE NUCLEAR AREA – AGAIN!
THE DEEP RECONSTRUCTIONS
AN EARLY AGRICULTURAL TRANSITION IN NEW GUINEA
AGRICULTURE AND LANGUAGE – LINKED IN DISPERSAL?
Edwin G. Pulleyblank 蒲立本
RESPONSE TO L. SAGART’S “SOME REMARKS ON THE ANCESTRY OF CHINESE”
Excursus on presyllables in Chinese
Are lexical correspondences between Chinese and Tibeto-Burman the result of borrowing?
RESPONSE TO STAROSTIN’S PROPOSAL FOR A CONNECTION BETWEEN SINO-TIBETAN, YENISEIAN AND NORTH CAUCASIAN
Laurent Sagart 沙加尔
COMMENTS ON W. BAXTER’S ARTICLE “‘A STRONGER AFFINITY… THAN COULD HAVE BEEN PRODUCED BY ACCIDENT’: A PROBABILISTIC COMPARISON OF OLD CHINESE AND TIBETO-BURMAN”
1. PROBLEMS WITH YAKHONTOV’S WORD LIST
1.1. Circularity in testing the Chinese-TB relationship on the basis of a list compiled at least in part on the basis of the Chinese-TB shared vocabulary
1.2. Bias in favor of nominals
1.3. Borrowings on Yakhontov’s list
2. THE OC AND PTB RECONSTRUCTIONS FOR EACH MEANING WERE NOT EVOLVED INDEPENDENTLY
2.1. Baxter’s OC reconstructions
2.2. Selection of TB cognate sets
3. BAXTER’S SOUND CORRESPONDENCES HAVE ‘LOANISH’ FEATURES
3.1. Reflection of tones, aspiration, and central vowels in borrowing by a language that lacks these features
4. THE SUPPORTING MATERIAL CONTAINS CHINESE LOANWORDS
COMMENTS ON STAROSTIN’S ARTICLE “OLD CHINESE BASIC VOCABULARY: A HISTORICAL PERSPECTIVE”
1. STAROSTIN’S LEXICAL COMPARISONS
1.1. blood. 血 *hmit (<-ik) > hwit, PNC (Starostin) *hwěɁnV.
1.2. dog. 犬 *khw[i,e]nɁ = PNC (Starostin) *ϫHwěje.
1.3. ear. 耳 *njiŋɁ > ńźji? = PNC (Starostin) *ʕwănʕV.
1.4. give. 予, 與 *ljaɁ = PEC (Starostin) * -íʟV (diacritic omitted).
1.5. horn. 角 *krok = PNC (Starostin) *qw rhV.
1.6. moon. 月 *ŋwjat = PNC (Starostin) *wəmcŏ
1.7. new. 新 *sjin < -ŋ = PNC (Starostin) *cănɁV
1.8. one. 一 *Ɂjit = PNC (Starostin) *cHə.
1.9. salt. 鹺 *dzar : PNC (Starostin) *cwěnhV.
1.10. tail. 尾 *mjijɁ = PEC *mēʁV (short final vowel).
1.11. tongue. 舌 *Ljat = PNC (Starostin) *mělci
1.12. two. 二 *njij-s = PNC (Starostin) *năwši.
1.13. year. 年 *nin < -ŋ = PNC (Starostin) *śwänĭ
2. SOME REMARKS ON THE METHODOLOGY
2.1. Phonetic matches
2.2. Semantic matches
2.3. Morphology and word-families.
2.5. Comparative procedure.
2.6. Extra-linguistic evidence.
COMMENTS ON P. LI JEN-KUI’S ARTICLE “IS CHINESE GENETICALLY RELATED TO AUSTRONESIAN?”
1. CHANCE SIMILARITIES AND LOOSE SEMANTIC EQUATIONS.
2. LACK OF BASIC VOCABULARY: NO NUMERALS, NO PRONOUNS, NO ITEMS FOR NATURAL PHENOMENA.
3. WORD STRUCTURE.
4. USE OF LATE AND LOW-LEVEL FORMS
5. COMPARING THE CHINESE AND AUSTRONESIAN MORPHOLOGIES
6. AUSTRONESIAN LOANWORDS IN CHINESE AND TIBETO-BURMAN?
7. Li’s article ends with a list comparing “Proto-Austronesian and Sino-Tibetan basic and important vocabulary”, which purports to show that Chinese shares many more cognates with Tibeto-Burman than PAN does with OC. For lack of time and space I will not comment on the selection of items in that list.
Stanley Starosta 师德乐
1. THE SINO-TIBETAN-AUSTRONESIAN HYPOTHESIS
2. SUBGROUPING CONSIDERATIONS
3. MORPHOLOGICAL RECONSTRUCTION
3.1 Lexical versus morphological evidence
4. CANONICAL WORD SHAPE
4.1 σ1σ2 σ2 and the propagation of initial-syllable morphology
4.2 σ2 σ1σ2 and the propagation of initial-syllable morphology
4.2.1 The compounding mechanism
4.2.2 Affixation mechanism
4.3 A compromise
5. FUTURE STRATEGY
Sergei A. Starostin 斯塔罗斯金
RESPONSE TO L. SAGART’S “SOME REMARKS ON THE ANCESTRY OF CHINESE”
1. THE SOUND CORRESPONDENCES
2. THE BORROWING ISSUE
3. PHONOLOGICAL CRITERIA