Linguistic corpora of understudied languages: Do they make sense?

A corpus of an understudied language is usually documentary-linguistic in nature and comprises all the text material available in that language. However, without text selection, it is impossible to obtain a representative and balanced sample of language use. Lack of these two characte...

Bibliographic Details
Main author: Vinogradov, Igor
Format: Online
Language: Spanish (spa)
Published: Universidad de Costa Rica. Campus Rodrigo Facio. Website: https://www.ucr.ac.cr/ Telephone: (506) 2511-4000. Support email: revistas@ucr.ac.cr 2016
Online access: https://revistas.ucr.ac.cr/index.php/kanina/article/view/24143
id KANINA24143
record_format ojs
timestamp 2022-05-31T02:52:53Z
keywords corpus linguistics; understudied languages; language documentation; quantitative methods
date 2016-05-03
type info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion
mime application/pdf
doi 10.15517/rk.v40i1.24143
source Káñina; Vol. 40 No. 1 (2016): Káñina (January-June); 127-141
issn 2215-2636; 0378-0473
fulltext https://revistas.ucr.ac.cr/index.php/kanina/article/view/24143/26095
rights Copyright 2016 Káñina
institution Universidad de Costa Rica
collection Káñina
language spa
format Online
author Vinogradov, Igor
description A corpus of an understudied language is usually documentary-linguistic in nature and comprises all the text material available in that language. However, without text selection, it is impossible to obtain a representative and balanced sample of language use. The lack of these two characteristics makes such a corpus almost useless for any kind of quantitative research. Nevertheless, corpora of understudied languages serve a wide range of language documentation objectives. Furthermore, they can provide evidence that word forms or grammatical features exist in texts that meet specific search criteria. If such corpora have well-developed linguistic annotation, they can complement grammatical descriptions and dictionaries, and their digital format sets them apart from ordinary text collections. They are especially suitable for typological research, which must deal with large amounts of data from different and unrelated languages.
title Linguistic corpora of understudied languages: Do they make sense?
publisher Universidad de Costa Rica. Campus Rodrigo Facio. Website: https://www.ucr.ac.cr/ Telephone: (506) 2511-4000. Support email: revistas@ucr.ac.cr
publishDate 2016
url https://revistas.ucr.ac.cr/index.php/kanina/article/view/24143