CODES (Developmental Corpus)

June 1, 2017 | Author: Mário Martins | Category: Language Development, Corpus Linguistics, Language and Literacy Development


Product Description

Author: Mário Martins, Universidade Federal do Amapá – UNIFAP (Brazil)
Collaboration: Amália Mendes, Centro de Linguística da Universidade de Lisboa – CLUL (Portugal)

CODES can be accessed at CLUL's CQPWeb platform: http://alfclul.clul.ul.pt/CQPweb/codes2016/

CODES is a developmental corpus of texts written by school-age children and adolescents who are monolingual speakers of European Portuguese. The texts were collected between September 2011 and January 2012 by Mário Martins for his PhD thesis, entitled Complexidade textual e progressão escolar em dois registos: um estudo de correlação baseado em um corpus quasi-longitudinal (Universidade de Lisboa), available at: http://repositorio.ul.pt/bitstream/10451/23963/1/ulsd072796_td_Mario_Martins.pdf

Designed as a quasi-longitudinal corpus, CODES is composed of 244 texts in the narrative (n = 122) and argumentative (n = 122) registers. The subjects (51% female and 49% male) are students in the 5th (n = 26; mean age = 10.19), 7th (n = 46; mean age = 12.33) and 10th (n = 50; mean age = 15.16) grades from four different public schools of the Portuguese basic schooling system. Each student wrote one narrative and one argumentative text, in response to two different writing tasks, as described below:

Narrative task: Narrate a remarkable event (real or imagined) that you and your best friend experienced during the last summer vacation.

(Narra um facto marcante (real ou imaginado) que tu e teu(tua) melhor amigo(a) viveram durante o último verão.)

Argumentative task: Do you think social networks (Facebook, Twitter, Google+, Windows Live Space, etc.) are important today? Write a text to be published on the school blog where you express your opinion on social networks. In this text, you should say whether you are for or against the existence of social networks. Do not forget to justify your opinion!

(Achas que as redes sociais (Facebook, Twitter, Google+, Windows Live Space, etc.) são importantes hoje em dia? Escreve um texto para ser publicado no blogue da tua escola em que exponhas a tua opinião sobre as redes sociais. Neste texto deves dizer se és a favor ou contra a existência das redes sociais. Não te esqueças de justificar a tua opinião!)

The students' parents were informed about the general objective of the research, a correlational study between textual complexity measures and school progression, and signed a consent form.

CODES was automatically part-of-speech tagged using the memory-based tagger (MBT) (Daelemans et al., 1996), with a tag set of 80 different tags. It was automatically lemmatized using a Portuguese version of the MBLEM lemmatizer (van den Bosch and Daelemans, 1999). CODES contains approximately 48,000 words.
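As a quick consistency check on the figures reported in this description, the following minimal Python sketch recomputes the corpus totals from the per-grade counts. The variable names are illustrative only, and the derived values (overall mean age, average words per text) are not reported in the description itself: they are rough estimates based on rounded per-grade means and an approximate word count.

```python
# Figures copied from the corpus description; everything derived below
# is an estimate, not a value reported by the corpus author.
students_per_grade = {"5th": 26, "7th": 46, "10th": 50}
mean_age_per_grade = {"5th": 10.19, "7th": 12.33, "10th": 15.16}
texts_per_student = 2          # one narrative + one argumentative text
approx_total_words = 48_000    # "approximately 48,000 words"

# Totals implied by the per-grade counts.
total_students = sum(students_per_grade.values())
total_texts = total_students * texts_per_student
texts_per_register = total_texts // texts_per_student

# Weighted mean age across grades (assumption: per-grade means are exact).
overall_mean_age = sum(
    students_per_grade[g] * mean_age_per_grade[g] for g in students_per_grade
) / total_students

# Rough average text length (assumption: the word count is exact).
approx_words_per_text = approx_total_words / total_texts

print(f"students: {total_students}")
print(f"texts: {total_texts} ({texts_per_register} per register)")
print(f"estimated overall mean age: {overall_mean_age:.2f}")
print(f"estimated words per text: {approx_words_per_text:.1f}")
```

The per-grade counts sum to 122 students, which matches the 122 texts per register and the 244 texts overall stated in the description.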

Cite the corpus as follows: Martins, Mário. 2015. CODES. Centro de Linguística da Universidade de Lisboa (CLUL). Available at: http://alfclul.clul.ul.pt/CQPnet/

For more information about CODES, please contact Mário Martins ([email protected]).

