Vol 3, No 2 (2025)
- Year: 2025
- Articles: 7
- URL: https://macrosociolingusictics.ru/MML/issue/view/2117
- DOI: https://doi.org/10.22363/2949-5997-2025-3-2
The Languages of the Peoples of the Russian Federation: Digital Documentation Tools and Media Accessibility
Corpus of written Tatar: structure, composition and applications
Abstract
The proposed study is analytical in nature and is devoted to examining and describing the structure, composition, and functional capabilities of the Written Corpus of the Tatar language, which represents one of the largest digital resources for the Turkic languages of the Russian Federation. The study discusses the stages of corpus creation and the methodology of text collection and annotation, including metadata annotation and morphological annotation using the Apertium system. Special attention is paid to applied aspects, such as the integration of the corpus into speech synthesis and speech recognition systems, the development of various linguistic services and its use in educational and research projects. The principles of eliminating duplicate texts are analyzed and prospects for further development are proposed, including the expansion of genre diversity and the introduction of international annotation standards. The material for this study comprises the corpus itself and publications describing the stages of its creation and application. The methodology is presented as a set of empirical methods and techniques, including observation, analysis, description, and testing (of the functional capabilities of the corpus, etc.), as well as the graphical method for visualizing the material under study. The study highlights the scientific and cultural significance of the Written Corpus of the Tatar Language in the context of the digitalization of the languages of the peoples of Russia, which corresponds to the objectives of the International Decade of Indigenous Languages (2022-2032), initiated by the United Nations General Assembly and coordinated by the United Nations Educational, Scientific and Cultural Organization.
97-116
Digital corpus of the linguoculture of the Northern Angara region: structure, composition and applications
Abstract
The development of digital technology has led to the emergence of new and convenient tools for documenting and preserving endangered idioms. Such idioms include languages, dialects, and other language variants whose standard forms are highly active. The study presents a comprehensive analysis of the Electronic Text Corpus of the Linguistic Culture of the Northern Angara Region (CLCNA), focusing on its structure, composition, and functional potential. The relevance of this study stems from the need to preserve regional variants of the Russian language and associated cultural practices in a rapidly changing world. The purpose of this study is to describe the corpus as a valuable tool for humanities research, providing a detailed overview of its structure and features. The study relies on data available on the official corpus website, including information about its structure, metadata, and descriptive elements. Using descriptive analysis, corpus linguistics, and content analysis techniques, the author provides a thorough description of the CLCNA’s three-tiered structure, including dialectal, folklore, and multimedia subsets. The findings of this research contribute to a better understanding of the corpus’s capabilities and potential applications in various fields of study. They include a description of the unique system of multi-dimensional manual annotation (spatial, temporal, genre, thematic, conceptual, and plot motif), an analysis of the functional capabilities of the online platform for complex search, and an assessment of the scientific significance of the corpus based on previous research in the fields of communicative dialectology, ethnolinguistics, and folklore studies. The study concludes by emphasizing the role of CLCNA as a key resource for preserving the intangible cultural heritage of the Northern Angara Region and its potential in educational, lexicographic, and technological projects.
117-130
Production of audiobooks in the languages of the peoples of Russia using speech synthesizers: problems and prospects
Abstract
The creation of audiobooks in the languages of the peoples of Russia using speech synthesizers is a scientifically and socially significant task. The relevance of the research is driven by the development of speech technologies and state policies supporting linguistic diversity, including in the digital space. The study examines a standard algorithm for audiobook creation, distinguishing between invariant and language-specific development stages. It notes that the main difficulties are associated with the stages requiring linguistic adaptation of the text for speech synthesis: annotation and the expansion of abbreviations and acronyms. For low-resource languages, tasks such as segmentation, tokenization, and contextual annotation, including the processing of homographs and specific phonetic features, pose particular challenges. In conclusion, it is argued that full automation of audiobook creation for the languages of Russia’s peoples is unfeasible with current speech synthesis technology. Developing audiobooks in such languages requires the prior creation of specialized linguistic resources. A necessary condition is the formation of a parallel corpus of texts and audio recordings produced by native speakers. Therefore, the successful implementation of such projects demands significant preliminary work on compiling training datasets and adapting algorithms to the specific features of each language.
131-145
Language and Culture as a Form of Symbolic Capital
Social and cultural restrictions for movie titles translation: why AI will never do it
Abstract
Numerous studies have been devoted to localising film titles. Most of them focus on adapting the film title to the target language environment from linguistic and cultural points of view. That said, the influence of extralinguistic factors on translating film titles seems to be under-investigated. Meanwhile, various translation forums and cinema websites publish reviews written both by professionals and laymen who discuss possible reasons behind the choice of a film title. The issue is now further complicated by the widespread development of AI technologies, which calls into question the very need for human translation in some fields of knowledge, especially when it comes to small-format texts such as a film title. Nevertheless, while localisation of a film through translating its title pertains to the area of marketing expertise, not translation per se, marketing rules have to be observed as well. Current machine translation systems are not yet equipped to perform these functions. In this light, this study aims to investigate the strategies of adapting English film titles to the Russian market, considering the social and cultural aspects of localisation. The study is analytical and is based on reviews from Russian streaming platforms and translation companies, which share their observations on strategies for translating English-language titles in the context of localising British and American films for the Russian market. The methodological core of the study consists of empirical methods.
The study identified 7 key localisation strategies: expanding the title, using memes, developing the story, adding slang, lowering speech register, adding sexual subtext, minimising the risk of politicising the title, expanding a well-known franchise.
146-165
Sociolinguistic parameters of the image of the USSR in W. Tevis’s novel “The Queen’s Gambit” and its film adaptations: legitimization of dominance
Abstract
The article describes the linguistic means used to create the image of the USSR on an intersemiotic plane, based on the analysis of characters’ speech in W. Tevis’s novel ‘The Queen’s Gambit’ (1984), its Russian translation «Ход королевы» (2020), the novel’s screen adaptation, and its Russian audio-visual adaptation. The research corpus comprises the lines of key characters from the literary work and the film. The lexical material from these lines was categorized into groups: characters’ first and last names, toponyms, chess terminology, and vocabulary describing Soviet realities. Special attention is paid to the role of the Russian language in the English-language original of the novel and to the strategies for conveying the function of Russian speech in the translation of the analyzed works into Russian. The research methodology combines quantitative and qualitative approaches. The quantitative approach utilizes distant reading tools (working with the Sketch Engine corpus manager and its Word List and Keyword tools), while the qualitative approach involves clustering and lexical-semantic analysis of focal lexical units. The analysis revealed key strategies for representing the image of the ‘Other’. In the original novel, the linguistic marking of the Soviet space is achieved through the transliteration of Russian lexemes, creating an effect of cultural distance for the English-speaking audience. In the Russian translation, this effect is neutralized, and the language barrier is conveyed implicitly through authorial remarks. The screen adaptation constructs the image of the USSR primarily through a complex of audio-visual codes (shots of architecture, everyday objects, nonverbal communication, etc.), creating a detailed visual context. However, in the localized series, the original linguistic polyphony is lost, leading to reduced authenticity and semantic simplifications. 
The study showed that the analyzed image emerges as a dynamic construct formed through the activation of verbal, translational, and visual strategies.
166-191
Symbolic capital and the construction of the “Other”: color as a marker of the image of Russia in American cinema
Abstract
Globalization processes, among other things, aim to unify the semiotic space, leading to the transformation of the structures of language and culture as symbolic capital. One of the most effective instruments of soft power is cinema, whose multimodal semantics offers an effective set of diverse tools for influencing the viewer. Beyond entertainment, films can be used as a tool of “soft power”. Existing research on this phenomenon focuses on plot and the complex presentation of events and characters, whose representation is the central element of the film. Studying such representational elements as spoken and written text, costumes, musical scores, framing, and color will help develop the most effective toolkit. The aim of this study is to trace the instrumentalization of color in the formation of the image of Russia in American cinema. American films, selected according to established criteria, served as the material. The study was conducted in two stages: 50 films were examined, and 30 of them were then examined in more detail. A corpus analysis of film subtitles revealed instances of references to Russia, Russian people, or elements of Russian culture. Subsequently, it was decided that the visual component at the moment the lines were spoken also needed to be considered, which led to a color study. This revealed that red, which references elements of national symbols, was the color most frequently used in the frame. Functionally, color in most cases merely complements and enhances the meaning conveyed by other instruments, rather than shaping it independently. The results of this study show that color choice is not always determined by aesthetic requirements, but can be a consequence of social and linguistic macroprocesses, behind which lies linguistic and cultural dominance.
192-213
SCIENTIFIC EVENTS. REVIEWS. REVIEWS
214-221