Speech and Language Resources

Developing products and applications in language technology requires basic linguistic resources (text and speech corpora, lexicons and knowledge bases) and development tools (morphological and syntactic analysers, word sense disambiguators, corpus processing tools, lemmatisers, integrated tool environments, etc.).

We have more than 25 years of experience in creating basic linguistic resources of this type, and we have various reference corpora, lexicons ...

Demos

Konbitzul

Database of translations of noun+verb combinations

e-ROLda

A tool for looking up verb entries in the BVI lexicon and examples in the EPEC-RolSem corpus

Universal Dependencies treebank for Basque

This treebank contains 121K words annotated following the guidelines of the Universal Dependencies project.

 

Contracts

Projects

Patents

Eusemcor

Corpus tagged with Basque WordNet senses.

Basque WordNet / Euskal WordNet

Basque WordNet

EDBL

Basque lexical database.

EPEC-ROLSEM

Corpus tagged with semantic roles.

EPEC-DEP (BDT)

A syntactic corpus annotated according to dependency grammar.

Resources

Publications

Arantxa Otegi, Aitor Agirre, Jon Ander Campos, Aitor Soroa, Eneko Agirre

Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque (2020)

Proceedings of The 12th Language Resources and Evaluation Conference, pp. 429–435. European Language Resources Association. ISBN: 979-10-95546-34-4

Javier Álvez, Itziar Gonzalez-Dios, German Rigau

Towards Word Sense Disambiguation by Reasoning (2020)

Vampire 2018 and Vampire 2019. The 5th and 6th Vampire Workshops. EPiC Series in Computing. Pages 19-29. ISSN: 2398-7340

Uxoa Iñurrieta

Identification and translation of verb+noun multiword expressions: a Spanish-Basque study (2020)

Procesamiento del Lenguaje Natural, 64, pp. 123-126.

Kepa Bengoetxea, Itziar Gonzalez-Dios, Amaia Aguirregoitia

AzterTest: Open source linguistic and stylistic analysis tool (2020)

Procesamiento del Lenguaje Natural, 64, 61-68. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6196

Rodrigo Agerri, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, Eneko Agirre

Give your Text Representation Models some Love: the Case for Basque (2020)

Proceedings of LREC. Also available at arxiv https://arxiv.org/pdf/2004.00033.pdf

Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre

DoQA - Accessing Domain-Specific FAQs via Conversational QA (2020)

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7302–7314

Itziar Gonzalez-Dios, Javier Álvez, German Rigau

Towards a Model for Ontologising WordNet Adjectives (2020)

Proceedings of the Workshop on Multimodal Wordnets (MMWN-2020), pages 1–6. ISBN: 979-10-95546-41-2 https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf

Jon Alkorta, Itziar Gonzalez-Dios

Exploring the Enrichment of Basque WordNet with a Sentiment Lexicon (2020)

Proceedings of the Workshop on Multimodal Wordnets (MMWN-2020), pages 20–24. ISBN: 979-10-95546-41-2 https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf

Thierry Declerck, Itziar Gonzalez-Dios, German Rigau (editors)

Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMWN-2020) (2020)

European Language Resources Association (ELRA), Paris. https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf ISBN: 979-10-95546-41-2 EAN: 9791095546412

Begoña Altuna, María Jesús Aranzabe, Arantza Díaz de Ilarraza

EusTimeML: A mark-up language for temporal information in Basque (2020)

Research in Corpus Linguistics 8: 86-104. ISSN 2243-4712. Asociación Española de Lingüística de Corpus (AELINCO) DOI 10.32714/ricl.08.01.06

Elena Zotova, Rodrigo Agerri, Manuel Nuñez and German Rigau

Multilingual Stance Detection in Tweets: The Catalonia Independence Corpus (2020)

Language Resources and Evaluation Conference (LREC 2020)

Jon Alkorta, Koldo Gojenola, Mikel Iruskieta

SentiTegi: building a semantic oriented Basque lexicon (2019)

Computación y Sistemas, 22 (4)

Igone Zabala

The elaboration of Basque in academic and professional domains. (2019)

In Grenoble, Lenore; Lane, Pia & Røyneland, Unn (eds.) Linguistic Minorities in Europe Online. De Gruyter Mouton. ISSN 2510-5361

Aitziber Atutxa, Kepa Bengoetxea, Arantza Diaz de Ilarraza, Mikel Iruskieta

Towards a top-down approach for an automatic discourse analysis for Basque: Segmentation and Central Unit detection tool (2019)

PLoS ONE 14(9): e0221639

Ander Soraluze, Olatz Arregi, Xabier Arregi, Arantza Diaz de Ilarraza

EUSKOR: End-to-end coreference resolution system for Basque (2019)

PLoS ONE 14(9): e0221801. https://doi.org/10.1371/journal.pone.0221801

Ainara Estarrona, Izaskun Etxeberria, Ander Soraluze, Manuel Padilla-Moyano

Spelling Normalisation of Basque Historical Texts (2019)

Procesamiento del Lenguaje Natural, vol. 63, pp. 59-66

Javier Álvez, Itziar Gonzalez-Dios, German Rigau

Commonsense Reasoning Using WordNet and SUMO: a Detailed Analysis (2019)

Proceedings of the Tenth Global Wordnet Conference, pp. 197-205. ISBN 978-83-7493-108-3

Itziar Gonzalez-Dios, German Rigau

Textual genre based approach to use wordnets in language-for-specific-purpose classroom as dictionary (2019)

Proceedings of the Tenth Global Wordnet Conference, pp. 222-227. ISBN 978-83-7493-108-3

Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre

Conversational QA for FAQs (2019)

NeurIPS 3rd Conversational AI Workshop: “Today's Practice and Tomorrow's Potential”

Begoña Altuna, Maria Jesús Aranzabe, Arantza Diaz de Ilarraza

Adapting TimeML to Basque: Event Annotation (2018)

In Gelbukh A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science (LNCS, vol 9624), 565-577. Springer, Cham. DOI https://doi.org/10.1007/978-3-319-75487-1_43; Print ISBN 978-3-319-75486-4; Online ISBN 978-3-319-75487-1

Uxoa Iñurrieta, Itziar Aduriz, Arantza Díaz de Ilarraza, Gorka Labaka, Kepa Sarasola

Konbitzul: an MWE-specific Database for Spanish-Basque (2018)

Proceedings of the 11th Language Resources and Evaluation Conference, Miyazaki, Japan. Pages 2500-2504.

Uxoa Iñurrieta, Itziar Aduriz, Ainara Estarrona, Itziar Gonzalez-Dios, Antton Gurrutxaga, Ruben Urizar, Iñaki Alegria

Verbal Multiword Expressions in Basque corpora (2018)

In the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (at COLING 2018)

Igone Zabala

Euskararen terminologiaren garapena Terminologiaren Teoria Komunikatiboaren argitan (2018)

In Ruben Urizar and Itziar Aduriz (eds.) Hizkuntzalari Euskaldunen III Topaketa. Zer berri?. 349-358.

Klara Ceberio, Itziar Aduriz, Arantza Díaz de Ilarraza and Ines Garzia-Azkoaga

Coreferential Relations in Basque: The Annotation Process (2018)

J Psycholinguist Res (2018) 47, Issue 2. Pages 325-342. https://doi.org/10.1007/s10936-018-9559-6. ISSN 0090-6905. Online ISSN 1573-6555.

Izaskun Aldezabal, Xabier Artola, Arantza Diaz De Ilarraza, Itziar Gonzalez-Dios, Gorka Labaka, German Rigau and Ruben Urizar

Basque e-lexicographic resources: linguistic basis, development, and future perspectives (2018)

Workshop on eLexicography: Between Digital Humanities and Artificial Intelligence. https://lexdhai.insight-centre.org/Lex_DH__AI_2018_paper_5.pdf

Ainara Estarrona, Izaskun Aldezabal, Arantza Díaz de Ilarraza

How the corpus-based Basque Verb Index lexicon was built (2018)

Language Resources and Evaluation. First Online 05 December 2018. DOI: https://doi.org/10.1007/s10579-018-9440-0. Springer Netherlands

Itziar Aduriz, María Jesús Aranzabe, José María Arriola, Arantza Díaz de Ilarraza, Itziar Gonzalez-Dios, Ruben Urizar

Building the Gold Standard for the Surface Syntax of Basque (2017)

Procesamiento del Lenguaje Natural, 58, 125-132. Available at http://ixa.si.ehu.es/sites/default/files/dokumentuak/8825/5421-4766-1-PB.pdf (print ISSN: 1135-5948; online ISSN: 1989-7553)

Itziar Aduriz, Iñaki Alegria, Olatz Arregi, Arantza Diaz de Ilarraza, Kepa Sarasola

Hizkuntza-teknologia “Datu Handien” garaian: programa bilatzaileak, itzultzaileak… (2017)

Senez, 48, pp. 191-200. ISSN: 1132-2152. https://eizie.eus/eu/argitalpenak/senez/20171102/aurkezpena/datuhandiak

Arantxa Otegi, Nora Aranberri, António Branco, Jan Hajic, Steven Neale, Petya Osenova, Rita Pereira, Martin Popel, Joao Silva, Kiril Simov, Eneko Agirre

QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages (2016)

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), European Language Resources Association (ELRA). ISBN 978-2-9517408-9-1

Estarrona A., Aldezabal I., Díaz de Ilarraza A. and Aranzabe M.J.

A Methodology for the Semiautomatic Annotation of EPEC-RolSem, a Basque Corpus Labeled at Predicate Level following the PropBank/Verbnet Model (2016)

Edward Vanhoutte (ed.) Digital Scholarship in the Humanities (2016) 31 (3): 470-492. DOI: http://dx.doi.org/10.1093/llc/fqv001 First published online: 17 June 2015 (23 pages). Published by Oxford University Press on behalf of EADH: The European Association for Digital Humanities (Online ISSN 2055-768X - Print ISSN 2055-7671)

A. Minard, M. Speranza, R. Urizar, B. Altuna, M. van Erp, A. Schoen, and C. van Son

MEANTIME, the NewsReader Multilingual Event and Time Corpus (2016)

Proceedings of LREC 2016. Pages 4417-4422. ISBN: 978-2-9517408-9-1

Maria Jesús Aranzabe, Aitziber Atutxa, Kepa Bengoetxea, Arantza Diaz de Ilarraza, Iakes Goenaga, Koldo Gojenola, Larraitz Uria

Automatic Conversion of the Basque Dependency Treebank to Universal Dependencies (2015)

Markus Dickinson, Erhard Hinrichs, Agnieszka Patejuk, Adam Przepiórkowski (eds), Proceedings of the Fourteenth International Workshop on Treebanks and Linguistic Theories (TLT14), 233-241. Institute of Computer Science of the Polish Academy of Sciences, Warszawa, Poland. ISBN: 978-83-63159-18-4

Iruskieta M., Aranzabe M., Diaz de Ilarraza A., Gonzalez I., Lersundi I., Lopez de Lacalle O.

The RST Basque TreeBank: an online search interface to check rhetorical relations (2013)

4th Workshop RST and Discourse Studies, 40-49, Sociedade Brasileira de Computação, Fortaleza, CE, Brasil. October 20-24 (http://encontrorst2013.wix.com/encontro-rst-2013)

Pociello E., Agirre E. and Aldezabal I.

Methodology and construction of the Basque WordNet (2011)

Language Resources and Evaluation. Springer. Volume 45, Issue 2, pp 121-142. ISSN 1574-020X. DOI 10.1007/s10579-010-9131-y.

Izaskun Aldezabal, Maria Jesús Aranzabe, Arantza Diaz de Ilarraza, Ainara Estarrona, Larraitz Uria

EusPropBank: Integrating Semantic Information in the Basque Dependency Treebank (2010)

Lecture Notes in Computer Science (LNCS) nº 6008, Alexander Gelbukh (Ed.), Computational Linguistics and Intelligent Text Processing. pp. 60-73. Springer Berlin Heidelberg New York. ISSN: 0302-9743, ISBN-10: 3-642-12115-2, ISBN-13: 978-3-642-12115-9. 11th International Conference, CICLing 2010, Iasi, Romania, March 21-27, 2010

Izaskun Aldezabal, Maria Jesús Aranzabe, Arantza Diaz de Ilarraza, Ainara Estarrona

Building the Basque PropBank (2010)

Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner and Daniel Tapias (eds.), Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC 2010), pp. 1414-1417, European Language Resources Association (ELRA), ISBN: 2-9517408-6-7. LREC 2010, Valletta (Malta), May 19-21, 2010

Uria L., Estarrona A., Aldezabal I., Aranzabe M., Díaz de Ilarraza A., Iruskieta M.

Evaluation of the Syntactic Annotation in EPEC, the Reference Corpus for the Processing of Basque (2009)

Lecture Notes in Computer Science (LNCS) nº 5449, Alexander Gelbukh (Ed.), Computational Linguistics and Intelligent Text Processing. pp 72-85. Springer. ISSN: 0302-9743, ISBN-10: 3-642-00381-8, ISBN-13: 978-3-642-00381-3. 10th International Conference, CICLing 2009, Mexico City, Mexico, March 1-7, 2009

Izaskun Aldezabal, Maria Jesús Aranzabe, Jose Maria Arriola, Arantza Diaz de Ilarraza

Syntactic annotation in the Reference Corpus for the Processing of Basque (EPEC): Theoretical and practical issues (2009)

Corpus Linguistics and Linguistic Theory 5-2 (2009), 241-269. Mouton de Gruyter. Berlin-New York. Print ISSN: 1613-7027 Online ISSN: 1613-7035

Izaskun Aldezabal, Klara Ceberio, Itsaso Esparza, Ainara Estarrona, Jone Etxeberria, Elixabete Izagirre, Mikel Iruskieta, Larraitz Uria

EPEC (Euskararen Prozesamendurako Erreferentzia Corpusa) segmentazio-mailan etiketatzeko eskuliburua (2007)

UPV/EHU / LSI / TR 11-2007

Itziar Aduriz, Maria Jesús Aranzabe, Jose Maria Arriola, Aitziber Atutxa, Arantza Diaz de Ilarraza, Nerea Ezeiza, Koldo Gojenola, Maite Oronoz, Aitor Soroa, Ruben Urizar

Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing (2006)

Corpus Linguistics Around the World. Book series: Language and Computers. Vol 56 (pp. 1-15). ISBN 90-420-1836-4 Ed. Andrew Wilson, Paul Rayson, and Dawn Archer. Rodopi. Netherlands.

Eneko Agirre, Izaskun Aldezabal, Jone Etxeberria, Mikel Iruskieta, Elixabete Izagirre, Karmele Mendizabal, Eli Pociello

Improving the Basque WordNet by corpus annotation. (2006)

Proceedings of Third International WordNet Conference. pp. 287-290. ISBN 80-210-3915-9. Jeju Island (Korea).

Izaskun Aldezabal, Olatz Ansa, Bertol Arrieta, Xabier Artola, Aitzol Ezeiza, Gregorio Hernández, Mikel Lersundi

EDBL: a General Lexical Basis for the Automatic Processing of Basque (2001)

IRCS Workshop on linguistic databases. Philadelphia (USA).
