Research article
Issue: № 3 (129), 2023


This article analyzes the variability of the singular and plural uses of the word “data”, drawing on academic and professional texts used in teaching English at technical universities. This object of analysis was chosen for two reasons: the need to establish the specific rules governing the use of the plural form, and the need to inform teachers about certain aspects of variability and the standard. Because the word occurs so often in scientific and technical texts for specific purposes, it regularly attracts the attention of undergraduate and postgraduate students and prompts numerous questions. The answers were obtained with the recently developed methods of corpus linguistics, which make it possible to examine the competing forms.

1. Introduction

Every English teacher has to answer many questions in the course of teaching practice: Can we say it this way? Can we write and speak in the same way? Are there other ways to express the same word or word combination? Why are non-standard forms used in a credible source? What are the peculiarities of American grammar and vocabulary? In answering such questions, a teacher should distinguish between the notions of language “variability” and “standard”.

Although many aspects of “variability” and “standard” are under constant study, a number of unsolved problems remain in language practice.

In this paper, the authors attempt to analyze the variability of the singular and plural uses of the word “data”, drawing on academic and professional texts used in teaching English at technical universities. This object of analysis was chosen for two reasons: the need to establish the specific rules governing the use of the plural form, and the need to inform teachers about certain aspects of variability and the standard. Because the word occurs so often in scientific and technical texts for specific purposes, it regularly attracts the attention of undergraduate and postgraduate students and prompts numerous questions. The answers were obtained with the recently developed methods of corpus linguistics, which make it possible to examine the competing forms.

2. Research methods and principles

Modern linguistics holds the well-established view that synchronic variation, that is, the coexistence of competing forms, is widespread in language. This view rests on the processes of language change. The parallel existence of such forms creates variability, which can lead to change if one of the competing forms falls out of use and is replaced by the other.


The recording and analysis of language phenomena found in unstable areas of the language system, their statistical treatment, and the identification of the reasons for the instability are generally carried out within the broad topic of territorial and social variability in Russian and foreign linguistics.


Linguistic (text) corpora were developed some time ago. Today they are continually updated information and reference systems. The most important of them are the Corpus of Contemporary American English (COCA), the British National Corpus (BNC), and the Corpus of Historical American English (COHA), which is the most valuable source for diachronic investigations.

The notion of “variability” is closely connected with the notion of “standard”. In this paper, the authors define the language standard as “the combination of the most stable, traditional realizations of language structure elements, collected and fixed by public language practice”.

When categorizing the main criteria for deciding whether a language phenomenon belongs to the standard, linguists usually rely on “source credibility” and frequency of use.


The frequency and spread of a phenomenon can undoubtedly be considered an important sign of its standardization, but this criterion is not always sufficient.

The most reliable basis for including a phenomenon in the “standard” category is its presence in credible sources, chiefly the authorial speech of classic and scientific literature and, to a lesser extent, newspaper materials. One should also take into account whether the phenomenon is treated as codified in the linguistic literature.

The permeability of the standard is unequal and differs at every language level. The most sensitive and vulnerable standards are those of lexis and phonetics. The grammatical system is more stable and more resistant to language change.

Divergent processes in morphology and syntax are much slower and less intensive than those in lexis and phonetics.

The noun is unstable in the category of number. In the research literature, the authors found the most intensive variability in nouns borrowed from Latin and Greek together with the affixes of the source languages. This variability shows itself primarily in two prominent processes:

- reinterpretation of the borrowed singular and plural forms;

- incorporation of the borrowed forms into the English morphological system by eliminating their foreignness and bringing them closer to structures typical of English.

Researchers can observe deviations from language standards only once these deviations have already spread in speech and have, in effect, become part of the standard.

Disputes about such newly codified language elements and their eligibility for inclusion in the standard continue even after their use has been sanctioned in credible grammar references. At present, for example, one subject of such disputes is the use of the Latin plural form “data” in the sense of a singular form.


3. Discussion

In the Oxford English Dictionary, the singular use of “data” is placed outside the scientific sphere. The word is defined as the plural of “datum” and is treated as plural in historical and specialized scientific fields; outside these fields, despite the objections of traditional grammarians, it is used as an uncountable noun, similar to “information”, and takes a singular verb. It is generally agreed that both “data was” and “data were” occur in Standard English.

Discussion of such combinations as “this data”, “that data”, and “the data supposes” dates from the beginning of the 20th century and, in the absence of a common opinion among linguists and journalists, is still going on.

Discussions held by linguists and journalists on the Internet and in leading British newspapers have shown that there is no unified attitude to the use of “data” in the singular. The British National Statistical Service adheres to the official point of view: "The word data is a plural noun, so write data are". However, I. Garriet, a member of the Royal Statistical Society, disagrees and declares that there is no official point of view: “Statisticians of a certain age and status refer to them as plural noun, but people like me use them in singular”. D. Marsh, a journalist at the English newspaper “Guardian” and a specialist in language usage, states that the Latin plural form “data” is used as a singular form everywhere: “data are” is perceived as “hypercorrect, old-fashioned and pompous”.


R. Quirk, S. Greenbaum, G. Leech, and J. Svartvik noted the widespread use of “data” as an uncountable noun in the singular and showed that “datum” is a very rare form in university grammar textbooks.


Interesting results can be obtained from a statistical analysis of the use of the word “data” in the singular and plural, carried out with the BNC, COCA, and COHA text corpora, because the relationship between competing realizations is known to vary. Often one variant is the main, dominant one and the other is secondary. Obsolete and newly emerging variants can also be regarded as secondary.

A frequency analysis of the word “data” in the BNC and COCA databases found that, in British English scientific works, the combination “this data” occurred about 50 times and the combination “these data” about 202 times (20/82%); in American English scientific works, the corresponding figures were 350 and 1580 (20/85%). These numbers do not suggest that the combination “this data” is dominant.
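The shares implied by the raw counts can be recomputed with a few lines of Python. This is only an illustrative sketch: the function name `form_shares` is hypothetical, and the counts are the approximate figures quoted in the text, so the recomputed shares may differ by a few points from the rounded percentages given there.

```python
def form_shares(singular_count: int, plural_count: int) -> tuple[float, float]:
    """Return the percentage shares of the singular and plural combinations."""
    total = singular_count + plural_count
    if total == 0:
        raise ValueError("at least one occurrence is required")
    return (100 * singular_count / total, 100 * plural_count / total)


# Approximate counts quoted in the text: ("this data", "these data").
counts = {"BNC": (50, 202), "COCA": (350, 1580)}

for corpus, (this_data, these_data) in counts.items():
    singular, plural = form_shares(this_data, these_data)
    print(f"{corpus}: this data {singular:.1f}%, these data {plural:.1f}%")
```

With these counts, “this data” accounts for roughly one fifth of the demonstrative combinations in both corpora, which agrees with the observation that it is not the dominant variant.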

When these corpora are used to determine whether “data” is treated as singular or plural on the basis of subject-verb agreement (the choice was between “the data suppose” and “the data supposes”), “data” turns out to be treated as singular in only 9% of cases in British English and 19% in American English.
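A classification by subject-verb agreement of this kind can be approximated on plain text with a regular expression. The sketch below is not the authors' procedure and does not query BNC or COCA; the verb lists, the function name `count_agreement`, and the sample sentences are assumptions made purely for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately short verb lists; a real study would need
# a fuller inventory or a part-of-speech tagged corpus.
SINGULAR_VERBS = {"is", "was", "has", "shows", "suggests"}
PLURAL_VERBS = {"are", "were", "have", "show", "suggest"}

# The word "data" followed directly by one word; "metadata" is not matched.
PATTERN = re.compile(r"\bdata\s+([a-z]+)\b", re.IGNORECASE)


def count_agreement(text: str) -> Counter:
    """Count singular and plural verb forms that directly follow 'data'."""
    counts = Counter()
    for match in PATTERN.finditer(text):
        verb = match.group(1).lower()
        if verb in SINGULAR_VERBS:
            counts["singular"] += 1
        elif verb in PLURAL_VERBS:
            counts["plural"] += 1
    return counts


# Invented sample text, not taken from any corpus.
sample = (
    "The data were collected in 2019. The data shows a clear trend. "
    "These data are consistent, although the data is incomplete."
)
result = count_agreement(sample)
print(result["singular"], result["plural"])
```

The heuristic only looks at the word immediately after “data”, so it misses cases where an adverb or clause intervenes; it is meant to show the principle of the count rather than to reproduce the reported percentages.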

4. Conclusion

The authors attempted to trace diachronically the growth in the use of “data” in American English with the help of the COHA corpus, but the results contained many discrepancies. The future coexistence of the singular and plural uses of “data” within the standard is difficult to predict, because any language is a dynamic system; only time will show which variant becomes dominant.

Having analyzed the specifics of the word “data”, the authors conclude that, although the singular use of “data” is included in dictionaries (an obligatory condition for codification) and is supported by the opinion of respected linguists, one cannot claim that this variant is widespread in scientific texts. At the same time, the authors cannot help noting the activity of divergent processes in this field, even though grammar as a whole, and morphology in particular, is considered the most stable part of the language system and the least subject to change, in contrast to lexis and phonetics.

Modern linguistics enables English teachers to observe how language elements function, thanks to the emergence and development of corpus research methods.

