Valentyn Stetsyuk (Lviv, Ukraine)

Personal web site


Language Substratum

The study of the substrate plays an important role in reconstructing prehistoric ethnogenetic processes. However, such a study must not rest on false conjectures about the original places of settlement of the speakers of the languages under study; otherwise their history will be distorted. For example, if we assume that the Albanian language continues a certain Paleo-Balkan language, and some of its words are treated as a Balkan substratum in other languages of the same region, then the history of those languages is already distorted (LINDSTEDT JOUKO, 2014, 168-183). In fact, the ancestral homeland of the Thracians, whose linguistic descendants are the Albanians, was located in the ethno-producing area in the upper reaches of the Oka and Desna Rivers. This is confirmed by lexical correspondences between Albanian and the languages of the Mordvins, whose ancestral home lay next to the Thracian one on the opposite bank of the Oka River. Such correspondences can receive only far-fetched explanations if linguists place the ancestral homeland of the Albanians in the Balkans, and the scientific value of such works will be negligible.

We have already considered some examples of substrate influence, the most revealing of which may be the large amount of common vocabulary in the Chuvash and German languages (see the section "Chuvash-Germanic Language Connections"). Here we will try to bring the phenomenon of language substrate into a certain system.

Many languages have substrate words that also appear in the etymological tables used to construct models of language relationship. Such words may not obey the phonetic laws of the related languages under consideration, yet they can still follow the inverse relationship between the number of common words in related languages and the distance between the areas in which those languages were formed. This is because the substrate words could have been taken from the related languages of a previous population with a completely different phonology, so their reflexes in the languages of the incoming population need not obey the laws that formed in those languages. However, because the configuration of the areas remains the same, the adopted substrate fits the same law of inverse relationship.
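The inverse relationship mentioned here can be illustrated with a small sketch. It assumes the simplest possible form, N = k / d (the text says only that the relationship is inverse, not what exact form it takes), and the language labels and word counts are hypothetical placeholders, not data from this study.

```python
def relative_distances(shared_words, k=1.0):
    """Map each language pair to a relative distance d = k / N.

    Assumption: the simplest inverse law N = k / d. The exact form
    of the relationship is not specified in the text.
    """
    distances = {}
    for pair, count in shared_words.items():
        if count <= 0:
            raise ValueError(f"no shared words for pair {pair}")
        distances[pair] = k / count
    return distances


def rank_by_closeness(shared_words):
    """Return language pairs ordered from nearest to farthest."""
    d = relative_distances(shared_words)
    return sorted(d, key=d.get)


# Hypothetical counts of shared words between placeholder languages.
example_counts = {("A", "B"): 200, ("A", "C"): 100, ("B", "C"): 50}
example_ranking = rank_by_closeness(example_counts)
```

Under this assumption a shared word contributes to the count whether it is inherited or taken from a substrate, which is why substrate words do not disturb the ordering of the areas.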

Substratum influence on a language can show syntactic, phonetic, and lexical features. As V.I. Abaev asserted, “the substrate influences can reveal themselves (especially in phonetics and in syntax) a long time after ethnic substrate environment has disappeared or was dissolved long ago and its language has ceased existing on this territory” (ABAEV V.I., 1965: 48). The most persistent is the phonetic substrate, as evidenced by the existence of retroflex consonants in India and the adjacent territories of Iran (ibid.). This is due to the existence of a genetic basis for articulation. However, if the gene pool of a population changes through migration, the phonetic substrate may instead be a result of imitation. Therefore, the study of substrate influence should draw on historical and anthropological data, which is not always possible, as the example considered below shows.

One of the main phonetic features of South-Eastern Europe is the replacement of the stop g by the fricative γ and the pharyngeal h, ɦ, observed in Ukrainian, Czech, Slovak, Upper Lusatian, and Belarusian, in the southern dialect of Russian, in the western dialects of Slovene, and partly in Serbo-Croatian dialects. We will generally denote these sounds by h, as is done in Czech and Slovak. The question of the causes and date of these changes was first seriously raised by the Russian scholar N. Trubetskoj in the work “Zur Entwicklung der Gutturale in den slawischen Sprachen” (Sofija, 1933). He argued for the Proto-Slavic roots of the first phase of the transition of g to γ and saw its cause in the phonological system of the Proto-Slavic language. However, there is rather impressive evidence that the stop g was kept in Czech and Slovak probably up to the XII century (KOMÁREK MIROSLAV, 1983: 37-47). The basic argument for this opinion is that the letter h began to appear in place of g in the Czech annals only in the XII century, and that toponyms borrowed from Slovak into Hungarian sometimes have the phoneme g. Nevertheless, V.I. Abaev found it possible to explain the transition of g to γ (h), which is characteristic of Armenian and Phrygian, by the influence of the Iranian languages (ABAEV V.I., 1965: 44-51). Probably the tendency toward this transition was latent in Czech and Slovak, or the chroniclers adhered rather strictly to Slavic spelling, and the Slovak toponyms had earlier been borrowed from German or Celtic and retained the original g at first.

We can look for another explanation by considering the full range of languages (Czech and Slovak among them) for which this transition is characteristic. It is noteworthy that the ancestors of the speakers of these languages were concentrated in the basin of the tributaries of the Pripyat. This suggests that the source of this phonological phenomenon was here. It can be assumed that it was the influence of Scythian (Proto-Bulgarian) on the languages of neighboring ethnic groups, because the modern Chuvash language has no stop consonant g. Old Chuvash had the voiced velar fricative γ, which systematically replaced the ancient Turkic q (it is now written with the letter x, which in intervocalic position is pronounced like Ukrainian γ).

It should be noted that the Greek voiced fricative γ stands in the place of the Indo-European voiced stops g, gʷ, ĝ (KRAHE HANS, 1966, 91), though the sound g appeared (or was kept?) in Modern Greek in place of the voiceless stop k in the position after n (LOPASHOV Yu.A., 1990, 33). It can be assumed that the tendency to replace g by γ arose among the ancient Greeks in their ancestral home, located between the Lower Pripyat and Berezina rivers. No such phenomenon exists in Latin, and the Greeks could have come into contact with the Bulgars only after the Italics left their ancestral homeland.

Another phonetic substratum may be the hypothetical sound rz, which has been considered above. The phenomenon of rhotacism, i.e. the replacement of the phoneme z (s) by the phoneme r, known in Latin from the IV century BC, also took place in some West Germanic languages (EGOROV V.G., 1971: 25). Evidently these languages, like Turkic, had a sound rz which later passed into ordinary r. The Czech language keeps it to the present day. Some phonetic facts also point to the existence of the sound rz in Ukrainian: Ukr. žerst’ – Rus. žest’ “tin-plate”. This word is a loan from the Turkic languages, where it has the sense “copper, brass” and exists in the forms jes, zes, zis, etc. The origin of the Ukrainian r remains unclear; it is usually explained as arising under the influence of the word šerst’ “wool”, i.e. unpersuasively (VASMER MAX, 1964-1974; MELNYCHUK O.S., ed., 1982-2004). Borrowing of the Ukrainian word from one of the Turkic languages can explain the presence of the sound rz in Ukrainian, if the borrowing took place at a time when rz still existed in some Turkic language. That is, the Turkic proto-form can be restored as *zerz. Then, if a parallel form *zelz existed, it is also possible to explain Old Slavic *zelzo “iron”, whose etymology is unclear. The Ukrainian word stands phonetically closest to Turkic *zelz, but it is not clear from which Turkic language it was borrowed. In Chuvash nothing is found except çěrě “ring” (from zerze), but this word is rather distant in meaning.

There is one more very interesting example confirming the existence of the phoneme rz. The Latin cursarius is widespread in many languages in the form corsar “a pirate”. Such parallels as Chuv xarsăr “courageous”, Karach., Balk. ğursuz “malicious”, Tur hırsız “a thief”, and other Turkic words can be considered matches for this Latin word. Ukr xarcyz, xarcyzáka “a robber” is borrowed from one of the Turkic languages, but there is also Polish harcerz “a scout”, whose spelling reflects the phoneme rz. The word corsar is usually derived from Latin currere “to run”, but the Turkic words are undoubtedly much closer in meaning. The Latin, Polish, and Ukrainian words were borrowed from Turkic at various times, and the Latin word later passed into many European languages.

The fact that the important phonetic features of the two branches of the Slavs have no precise boundaries also speaks for their deep substratum character. Conservative phonetic phenomena become closely attached to a certain territory, but because of their extremely long existence they have blurred boundaries. On this basis, the transition gv, kv > cv, zv may have the same roots as the “centum-satem” division, which was caused primarily by the influence of the Finnic languages. The Finnic substratum strongly influenced the southern Russian dialect as its speakers moved toward the banks of the Volga. This influence explains akanye (a-speaking), tsokanye (c-speaking), and some grammatical phenomena in spoken Russian (BIRNBAUM HENRIK, 1990: 8).

Many lexical correspondences between languages of different groups, revealed in the course of this research, can also be explained by substratum influence. We shall consider the most convincing of them by ethno-producing area. These areas were populated first by the primary Indo-European tribes, then by the Germanic and Iranian ones, then by the Balts, and last by the Slavs. The Baltic tribes in these areas were assimilated by the Slavs, so we have no idea of their language, but they passed part of the substratum vocabulary on to the Slavs who settled over them. We shall begin with the western areas.

1. The western area between the Vistula, the Narev, the Yaselda, and the Upper Pripyat, on both banks of the Western Bug, actually consists of two areas with the Western Bug as the boundary between them. Evidently, in line with the concept adopted here, the languages formed in this area always had two primary dialects. We do not have reliable data on all the languages of this area, but the assumption is confirmed by the Sorbian language of the Lusatian Serbs, which is in fact divided into two separate languages, Upper and Lower Lusatian. Before that, the Dutch and Frisian languages were formed in this area, and Proto-Celtic earlier still. While working through the “Historical-Etymological Dictionary of the Upper and Lower Lusatian Languages” (SCHUSTER-SCHEWC H., 1976), we found some words that have no matches in the other Slavic languages. Two of them have correspondences in French: Lus. bakut “snipe (bird)” – Fr bekot “snipe”, Lus. barliś “to chat” – Fr parler “to speak”. Having no explanation, Schuster-Schewc supposed, with great doubt, that the Lusatian words were borrowed from French. Most likely both words go back to Proto-Celtic. A. Dauzat derived Fr bekot from Lat beccus (DAUZAT Albert, 1930), and in the Etymological Dictionary of the Latin Language (WALDE A., 1965) the latter word is marked as “Gal (Gallic)”. When this area was occupied by Germanic peoples after the Celts, the Dutch and Frisian languages began to form here. Since the Celtic word made its way into Lusatian, it ought to be present in Dutch or Frisian as well. No match was found in either, but the word bek “beak” is present in both Dutch and Frisian. It is held to be borrowed from French bec, which is also considered to be of Celtic origin (VEEN van, P.A.F.; SIJS van der, NIKOLINE, 1997). The snipe, as is known, has a long beak, from which it takes its name. No connection of Fr parler with Celtic has been found, as A. Dauzat considered this word a derivative of Lat. parabolare, which is borrowed from Greek. But Lus. barliś can be compared with Dutch pralen “to brag” and brallen “to shout”. A similar word is also present in German. F. Kluge noted it as of obscure origin, most likely onomatopoeic. The etymological dictionary of the Dutch language (ibid.) states that the Dutch word is borrowed from German: Low Ger pralen “to speak much”. Thus, one can suppose that the root *parl/pral “to speak” had been in local use in the neighboring areas since Celtic times.

2. The area between the Upper Western Bug and the Sluch (the right tributary of the Pripyat) was occupied first by the Illyrians, then by the ancestors of the modern Germans (we shall conventionally call them "Teutons" below), and later by the Czechs. The Illyrian language is unknown, and Czech has many German loanwords from different periods, so identifying the lexical substratum is complicated and could be the subject of special research.

3. It is also difficult to discover the substrate vocabulary in the area of the Proto-Germanic tribes between the rivers Neman, Yaselda, Pripyat, and Sluch (the left tributary of the Pripyat). Later this area was occupied by the Goths, and after them it was settled by the Poles. We do not have enough Gothic vocabulary to detect substratum phenomena, and the available words almost always have parallels in German.

4. The Slavic Urheimat on the banks of the river Viliya has been populated by Slavs almost all the time, so the concept of a substratum does not apply to this area.

5. Phenomena of the Baltic substratum should appear in the area of the Belarusians, but they are too difficult to detect because the Belarusian language has a large number of loanwords from Lithuanian, which cannot be confidently divided into layers.

6. The area between the Berezina and the Dnepr was occupied by the Tocharians; later the Eastern Balts settled here, and after them the speakers of the Northern Russian dialect. Because of the scarcity of Tocharian vocabulary and the long proximity of Balts and Slavs, research into substratum phenomena here is a very complicated task.

7. The Ukrainian language was formed in the same place as Greek and Norse, in the area between the Lower Berezina, the Lower Pripyat, and the Sluch (the left tributary of the Pripyat). This area is currently occupied by the Belarusians, so the Greek substratum can be found in Belarusian too. An ancient substrate word of this area is *krene “source, well”, though the majority of scholars deny a connection between Ukr krynyc’a, Blr krynica, and Gr κρηνη, all “a source”. However, the experts (FRISK H., 1970; HOLTHAUSEN F., 1974) see a match for Gr. κρηνη and κρουνός “source, tide, flow, jet” in Icl. hrønn “wave” and OE hræn “wave, tide, sea”. Evidently the Norsemen adopted this word from the remnants of the Proto-Greek population of this area, and the ancestors of the Ukrainians took it over from them, adding the Slavic suffix –yc’a. Then the word, in the form krynyc’a, was borrowed from the Ukrainians by the Belarusians, Russians, and Poles. Other Greek-Ukrainian-North Germanic correspondences that can be considered substrate words:

Gr βλεμμα “eye, look” – Icl blim-skakka “to be squint-eyed” (Icl skakka “curve”) – Ukr blymaty “to see, blink”.

Gr γλεπω “I see” – Sw glippa, Dan glippe “to see, blink”, Ukr hlypaty “to look”.

Icl. glop-r “an idiot” – there is the root glup "silly" in many Slavic languages. The word is absent in other Germanic languages.

Gr. κωβιος “gudgeon” (Gobio gobio) – Icl. kobbi “young seal”, Sw. kobbe “seal” – Ukr. kovbyk, Blr. kovbel “gudgeon”. There are similar words in the Baltic and Russian languages (LAUCHUTE Yu.A., 1982: 143), but they are phonetically more distant. This may be a wandering word.

Icl. köstr “a pile”, “a pile of fuel” – Ukr. koster, Pol. kostra “a pile”, “a pile of fuel”, Rus. koster “fire, fireside”, and other Slavic forms.

Icl. smuga “a narrow cleft” – Ukr. smuga "stripe".

Gr χαρις “beauty” – Old Norse hannr “skilled” – Ukr harnyi “beautiful”.

Gr. σκαπερδα “a game of Greek youths with a stick during the Dionysia” (FRISK H. 1970) – Ukr skopyrdyn “a game in which a stick is thrown so that it strikes the ground with both ends in turn” (VASMER M., 1971: 649). The two games differ significantly, so the similarity of the names seems mysterious, while the Ukrainian skopyrdyn is well explained by Icl. skoppa “bounce” and jörðin “earth”.

8. The area between the rivers Teterev, Sluch (the right tributary of the Pripyat), and Pripyat was occupied in turn by the ancient Italics, the Anglo-Saxons, and the Slovaks. English has many borrowings from Latin, and some of them must be substrate words; although it is rather difficult to separate them out, some examples can be found:

Lat merus “only” – Eng merely – Slvk mirn( “only”.

Lat pellere “to beat, drive” – Eng pelt – Slvk pelat’ “to drive”.

Lat faecare “to dirty” – Eng feculence – Slvk fakat’ “to dirty”.

Lat valles “valley” – Eng valley – Slvk valov “manger, trough”.

Lat sulcus “furrow” – Old Eng sulh “furrow”.

Lat collis “hill” – Old Eng hyll “hill”.

Lat currere “to run” – Slvk kurit’ “to drive”.

9. The area between the Desna, Sula, and Dnieper rivers was first settled by the Proto-Armenians; later those Iranians whose language developed into Afghan settled here, and later still the Slovene language began to form here. One can find many matches between the Iranian and Armenian languages, but such Armenian words were mostly borrowed from Persian. It is difficult to find a possible Armenian substratum in the vocabulary of the Afghan language, but it can be found in verbal word formation, which was very unstable at that time (GORNUNG B.V., 1963: 47). In both Armenian and Afghan, the infinitive is formed by the ending –al, –el added to the word stem, for example Arm. horēl “to dig”, Afg. xərəl “to dig, excavate”. The same word may also be a lexical substrate. In his etymological dictionary of the Slovene language, F. Bezlaj considers it possible to connect Slvn. bek (old form bъkъ) “furnace” with Arm boc “fire” and bosor “red”, holding that this word cannot be Slavic in origin and cannot be a loan from Vulgar Latin (Lat focus “stove”) for phonological reasons. However, it is not clear to him “in what way this word has come to the Southern Slavs” (BEZLAJ FRANCE, 1976: 16). Armenian bosor can be connected with Afg busar “smoldering ashes”, which stands isolated among the Iranian languages. Thus, the Slovene word passed from the Proto-Armenian substratum through Proto-Afghan into the language of the latest, Slavic population of this area. Bezlaj also connects Slvn bed, Serb. bêd “air” with Persian bad “air”, “wind”. A word of this root in the sense “wind” is present in many Iranian languages, including Pashto, so it is not excluded that the Slavic words belong to the Iranian substratum. The word lopta/lapta (Slvn, Serb lopta, Rus. lapta, etc.) is present in many Slavic languages in the sense “ball”, “a game with a ball”, or, seldom, “a stick for beating a ball”. M. Vasmer derived this word from lopata “shovel”, and this explanation is repeated in the Etymological Dictionary of the Ukrainian Language (VASMER MAX, 1964-1974; MELNYCHUK O.S., ed., 1982-2004). However, two reasons give grounds to doubt this etymology. First, the sense “a stick for beating a ball” is found only in Russian and Belarusian; in the other Slavic languages the word means “ball”, “sphere”, or “lump”. Second, the Macedonian form of the word is lopka. The word lap/lop/lob in the sense “ball”, “cheek”, “something convex” is scattered across the Iranian languages. These facts give us reason to conjecture that the word lopta had the primary form lop-ka (-ka is the Slavic suffix) and that it is an Iranian substratum in Russian and the South Slavic languages. Some other South Slavic words can also be considered an Iranian substratum (TRUBACHEV O.N., 1965).

10. The area between the Desna and the Iput’ was settled in turn by the ancient Phrygians, the Sogdians (modern Yagnobians), and the ancestors of the Serbs and Croatians. The Snov River distinctly divides this area into two halves, so the division of the primary Slavic tribe into Serbs and Croatians could have taken place in the ancient homeland. However, substrate phenomena are difficult to find, as we have almost no lexical material from the Phrygian language and the available Yagnobian material is very scarce. Nevertheless, the substrate influence of Proto-Sogdian on Serbo-Croatian can be confirmed by such parallels: Serb. budija “turkey” – Yagn búdina “quail”, Serb. buva “fly” – Yagn buvva “flea”, Serb. kulaš “dun horse” – Yagn kulo “dun horse”, Serb. kuća “a hut” – Yagn kuč “family”, Serb. čuka “sheep” – Yagn šok “a ram”.

There are also some Serbo-Croatian words that have matches in other Iranian languages: Serb. badža “tot” – common Ir bača “boy”, Serb. kurija “room, a rent” – Yagn kroî “to cost”, Kurd kerin “to buy”. Separately, by way of hypothesis, one can speak of a possible connection between Yagn gajk “daughter”, which is isolated among the Iranian languages, and Slavic words of the type gajka “screw-nut”. Vasmer marks it as “a difficult word”. It has a wider sense in Serbian than in the East Slavic languages, namely: 1) “a nut, female screw”; 2) “a ring as a sighting frame”; 3) “a movable ring on a bridle”. The word may therefore be oldest in this language and derive from the Iranian substratum, considering the analogy: Germ Mutter 1) “mother”; 2) “a nut, female screw”.

11. The Urheimat of the Indo-Aryans was in the basin of the Sozh, between the Upper Dnepr and the Iput’ Rivers; later the Proto-Ossetian language began to form here, and later still the Southern dialect of the Russian language. V.I. Abaev gives many Indian-Ossetic and Ossetic-Russian matches in his dictionary (ABAEV V.I., 1958-1989), but it is difficult to distinguish the Ossetic substratum in Russian because of the linguistic contacts between Russians and Ossetians in later times. We can speak of it with confidence only when we have an Indian-Ossetic-Russian triad. Only a word that has no correspondences in the other Indo-European languages and, among all the Iranian languages, has a match only in Ossetian can count as an indisputable Indian substratum in Russian. There are almost no matches in all three languages, but the following cases can be considered:

OInd. ksap “to ruin, destroy” – Osset. safyn 1. “lose”, 2. “ruin”;

OInd. pakşa “side”, “wing” – Osset. fax “side” – Let. paksis “a corner of a house” – Rus. dial. paksha “the left hand”;

OInd. palāša- “leaves” – Osset. bälas “tree”;

OInd. anga “a member of the body” – Osset. ong “a member of the body”;

OInd. čhāga- “goat” – Osset. säğ “goat”;

OInd. stubh “to make noise” – Osset. stuf “noise, a sound”;

OInd. stukā “wisp of hair” – Osset. styg “wisp, curls”.

Russian words similar to Ossetian ones may not be of substratum origin but borrowed from the Moksha language, which could either have received them from Proto-Ossetic or share a common source of borrowing with it. For example, Osset. syxsy “stone berry” has a match in Rus. dial. shiksha “some kind of berry”, but there is also Moksha shukshtoru “currant” (toru is another root). Thus, the issue of the Indian and Ossetic substratum in the Russian language requires a special study.

12. The easternmost area of the whole Indo-European territory, on the watershed between the tributaries of the Dnepr and the Volga, was the area where the Thracian language was formed. Later the Proto-Kurds, having separated from the Iranian community, lived here, then the Balts, and after them those Slavs who became the ancestors of today's Bulgarians. We know very little about the Thracian language, and possible lexical coincidences between it and Bulgarian may belong to the Balkan substratum instead. Thus, only the search for Bulgarian-Kurdish lexical parallels can give us some results. Bulg. bagazáj “matchmaker” is marked as “not clear” in the etymological dictionary of the Bulgarian language (GEORGIEV V.L., GЪLЪBOV Iv., 1971). This word can be explained on a Kurdish basis: Kurd. bava “father” and zava “son-in-law"; compare also zayin “to give birth” and zoy “son” in Pashto. The origin of the obscure plant name bozlan can be explained with the help of Kurd boz “grey” and lam “leaf”. Bulgarian xubav “good” can be compared with Kurd. xob “good”, though this word is common Iranian.

For many words of Iranian origin scattered across the Slavic languages, it cannot be said with confidence by what route and at what time they entered them.