Information Extraction and Information Retrieval
The ever-increasing availability of unstructured textual resources on the Web, and their potential for use in applications that acquire knowledge automatically, have caused a dramatic rise in research on Information Extraction (IE) and Information Retrieval (IR). Traditionally, the required textual content was produced through manual annotation by human experts in the task at hand, which is too costly in terms of both economic and human resources. In the last decade, new t...
Demos
Demo of the NewsReader NLP pipeline
Paste in any English text to see which entities, events, and other annotations are added automatically. The result is represented in the NAF format.
Demo of the NewsReader NLP pipeline
Paste in any Spanish text to see which entities and other annotations are added automatically. The result is represented in the NAF format.
Eihera
Basque named entity recognizer/classifier
Eustagger
Basque lemmatizer and morphosyntactic analyzer
Contracts
Projects
DeepKnowledge (PID2021-127777OB-C21) project funded by MCIN/AEI/10.13039/501100011033 and by "ERDF A way of making Europe"
(2022 - 2025)
IC4LANG: In-context learning as a new paradigm for researching scalable, high-precision language technologies adapted to the industrial needs of the Basque Country
(2023 - 2025)
Antidote (PCI2020-120717-2) funded by MCIN/AEI/10.13039/501100011033 and by European Union NextGenerationEU/PRTR
(2021 - 2024)
Disargue: Few-shot Learning and Argumentation to Detect and Fight Misinformation in Social Media
Disargue (TED2021-130810B-C21) funded by MCIN/AEI/10.13039/501100011033 and by European Union NextGenerationEU/PRTR
(2022 - 2024)
Better Extraction from Text Towards Enhanced Retrieval
(2019 - 2023)
Tools for the analysis of parliamentary discourses: polarization, subjectivity and affectivity in the post-truth era
(2020 - 2022)
DeepReading: Mining, Understanding, and Reasoning with Multilingual Content.
(2019 - 2021)
Deep learning, Big Data and knowledge for multilingual text processing.
(2019 - 2021)
New generation of neural artificial intelligence models to transform language technologies in the Basque Country's industry.
(2020 - 2021)
Automated surveillance of key questions on COVID-19 in scientific publications
(2020 - 2021)
Learning to Interact with Humans by Lifelong Interaction with Humans
(2017 - 2020)
CROSSTEXT: Automatic Generation of Multilingual Semantic Processors
Automatic generation of multilingual semantic taggers
(2017 - 2019)
TUNER: Automatic domain adaptation for semantic processing.
(2016 - 2018)
MUSTER: Multimodal processing of Spatial and TEmporal expRessions: Toward Understanding Space and Time in Language Enhanced by Vision.
(2016 - 2018)
OpenMinTeD: Sharing IXA pipes in the OpenMinTeD platform.
(2018 - 2018)
Patents
Resources
- EIEC
Basque Named Entity Recognition corpus.
- EDIEC
Basque corpus annotated for Named Entity Disambiguation.
- MCR: Multilingual Central Repository
Multilingual lexical database with wordnets for several European languages, including Basque.
- EPEC-EuSemcor
Corpus tagged with Basque WordNet senses.
Publications
Ander Salaberria, Gorka Azkune, Oier Lopez de Lacalle, Aitor Soroa, Eneko Agirre
Image captioning for effective use of language models in knowledge-based visual question answering (2023)
Expert Systems with Applications, 2023, vol. 212, p. 118669. Preprint: https://arxiv.org/abs/2109.08029
Nayla Escribano, German Rigau, Rodrigo Agerri
A modular approach for multilingual timex detection and normalization using deep learning and grammar-based methods (2023)
Knowledge-Based Systems, Volume 273, 2023, 110612, ISSN 0950-7051. https://doi.org/10.1016/j.knosys.2023.110612 (https://www.sciencedirect.com/science/article/pii/S0950705123003623)
Abstract: Detecting and normalizing temporal expressions is an essential step for many NLP tasks. While a variety of methods have been proposed for detection, the best normalization approaches rely on hand-crafted rules. Furthermore, most of them have been designed only for English. In this paper we present a modular multilingual temporal processing system combining a fine-tuned Masked Language Model for detection, and a grammar-based normalizer. We experiment in Spanish and English and compare with HeidelTime, the state-of-the-art in multilingual temporal processing. We obtain best results in gold timex normalization, timex detection and type recognition, and competitive performance in the combined TempEval-3 relaxed value metric. A detailed error analysis shows that detecting only those timexes for which it is feasible to provide a normalization is highly beneficial in this last metric. This raises the question of which is the best strategy for timex processing, namely, leaving undetected those timexes for which it is not easy to provide normalization rules or aiming for high coverage.
Keywords: Temporal processing; Multilingualism; Sequence labeling; Grammar-based approaches; Deep learning; Natural language processing
Murali Kondragunta, Olatz Perez-de-Viñaspre, Maite Oronoz
Improving and Simplifying Template-Based Named Entity Recognition (2023)
In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 79–86, Dubrovnik, Croatia, May 2023. Association for Computational Linguistics.
Rodrigo Agerri, Eneko Agirre
Lessons learned from the evaluation of Spanish Language Models (2023)
Procesamiento del Lenguaje Natural (70), pp 157-170
Gorka Urbizu, Iñaki San Vicente, Xabier Saralegi, Rodrigo Agerri, Aitor Soroa
Scaling Laws for BERT in Low-Resource Settings (2023)
Findings of the Association for Computational Linguistics: ACL 2023
Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heintz, Dan Roth
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey (2023)
ACM Computing Surveys. 27 June 2023
Jeremy Barnes, Samia Touileb, Petter Mæhlum, Pierre Lison
Identifying Token-Level Dialectal Features in Social Media (2023)
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Arantxa Otegi, Iñaki San Vicente, Xabier Saralegi, Anselmo Peñas, Borja Lozano, Eneko Agirre
Information retrieval and question answering: A case study on COVID-19 scientific literature (2022)
Knowledge-Based Systems, Volume 240.
Oscar Sainz, Itziar Gonzalez-Dios, Oier Lopez de Lacalle, Bonan Min, Eneko Agirre
Textual Entailment for Event Argument Extraction: Zero- and Few-Shot with Multi-Source Learning (2022)
In Findings of the Association for Computational Linguistics: NAACL 2022, Seattle, Washington. Association for Computational Linguistics.
Oscar Sainz, Haoling Qiu, Oier Lopez de Lacalle, Eneko Agirre, Bonan Min
ZS4IE: A toolkit for Zero-Shot Information Extraction with simple Verbalizations (2022)
In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations, Seattle, Washington. Association for Computational Linguistics.
Eneko Agirre
Few-shot Information Extraction is Here: Pre-train, Prompt and Entail (2022)
In Few-shot Information Extraction is Here: Pre-train, Prompt and Entail
E Agirre, M Apidianaki, I Vulić
Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2022)
Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures. Association for Computational Linguistics, Dublin, Ireland
David Samuel, Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, and Erik Velldal
Direct Parsing to Sentiment Graphs (2022)
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages: 470–478
Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Perez-de-Viñaspre, Rodrigo Agerri
BasqueParl: A Bilingual Corpus of Basque Parliamentary Transcriptions (2022)
Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3382–3390, Marseille, France. European Language Resources Association.
Iker Garcia-Ferrero, Rodrigo Agerri, German Rigau
Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings (2022)
Findings of the Association for Computational Linguistics: EMNLP 2022
Jeremy Barnes, Laura Oberlaender, Enrica Troiano, Andrey Kutuzov, Jan Buchmann, Rodrigo Agerri, Lilja Øvrelid, Erik Velldal
SemEval 2022 Task 10: Structured Sentiment Analysis (2022)
In SemEval 2022
Blanca Calvo Figueras, Montse Cuadros, Rodrigo Agerri
A Semantics-Aware Approach to Automated Claim Verification (2022)
In Proceedings of the Fifth Fact Extraction and VERification Workshop (FEVER), pages 37–48, Dublin, Ireland. Association for Computational Linguistics
Cristina Aceta, Johan Kildal, Izaskun Fernández, Aitor Soroa
Towards an optimal design of natural human interaction mechanisms for a service robot with ancillary way-finding capabilities in industrial environments (2021)
Production & Manufacturing Research, 9:1, 1-32
Ainhoa Serna, Aitor Soroa, Rodrigo Agerri
Applying Deep Learning Techniques for Sentiment Analysis to Assess Sustainable Transport (2021)
Sustainability 13, no. 4: 2397.
Aitzol Elu, Gorka Azkune, Oier Lopez de Lacalle, Ignacio Arganda-Carreras, Aitor Soroa, Eneko Agirre
Inferring spatial relations from textual descriptions of images (2021)
Pattern Recognition, Volume 113, 107847. Pre-print: https://arxiv.org/abs/2102.00997
Eneko Agirre, Marianna Apidianaki, Ivan Vulić (Editors)
Proceedings of Deep Learning Inside Out (DeeLIO): The 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2021)
In conjunction with NAACL. Association for Computational Linguistics
Elena Zotova, Rodrigo Agerri, German Rigau
Semi-automatic generation of multilingual datasets for stance detection in Twitter (2021)
Expert Systems with Applications, 170 (2021).
Joseba Fernandez de Landa, Rodrigo Agerri
Euskarazko on-line artikuluetan aipatutako izendun entitate nabarmenen identifikazioa denbora errealean (2021)
Ekaia
Jon Alkorta
Hacia el análisis de sentimientos en euskera (2021)
Procesamiento del Lenguaje Natural, 66, 201-204.
Joseba Fernandez de Landa, Iker García, Ander Salaberria, Jon Ander Campos
Twitterreko Euskal Komunitatearen Eduki Azterketa Pandemia Garaian (2021)
IV. Ikergazte. Nazioarteko ikerketa euskaraz. Kongresuko artikulu bilduma. Ingeniaritza eta Arkitektura
Ander Barrena, Aitor Soroa, Eneko Agirre
Towards Zero-Shot Cross-Lingual Named Entity Disambiguation (2021)
Expert Systems With Applications ESWA 2021
Oscar Sainz, Oier Lopez de Lacalle, Gorka Labaka, Ander Barrena, Eneko Agirre
Label Verbalization and Entailment for Effective Zero- and Few-Shot Relation Extraction (2021)
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Rodrigo Agerri, Roberto Centeno, María Espinosa, Joseba Fernández de Landa, Álvaro Rodrigo
VaxxStance@IberLEF 2021: Overview of the Task on Going Beyond Text in Cross-Lingual Stance Detection (2021)
Procesamiento del Lenguaje Natural, 67, pp 173-181
Iker García-Ferrero, Rodrigo Agerri, German Rigau
Benchmarking Meta-embeddings: What Works and What Does Not (2021)
Findings of the Association for Computational Linguistics: EMNLP 2021
Yi-Ling Chung, Marco Guerini, Rodrigo Agerri
Multilingual Counter Narrative Type Classification (2021)
Proceedings of Argument Mining 2021
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Manuela Speranza, Roberto Zanoli
The E3C Project: European Clinical Case Corpus (2021)
Proceedings of the Annual Conference of the Spanish Association for Natural Language Processing: Projects and Demonstrations (SEPLN-PD 2021). Pages 17-20. ISSN: 1613-0073. URL: http://ceur-ws.org/Vol-2968/paper5.pdf
Eneko Agirre
Cross-Lingual Word Embeddings (Book Review) (2020)
Computational Linguistics 46 (1), 245-248. (https://doi.org/10.1162/COLI_r_00372)
Oier Lopez de Lacalle, Ander Salaberria, Aitor Soroa, Gorka Azkune and Eneko Agirre
Evaluating Multimodal Representations on Visual Semantic Textual Similarity (2020)
Proceedings of the Twenty-third European Conference on Artificial Intelligence, ECAI 2020, June 8-12, 2020, Santiago de Compostela, Spain
Oscar Sainz, Oier Lopez de Lacalle, Itziar Aldabe, Montse Maritxalar
Domain Adapted Distant Supervision for Pedagogically Motivated Relation Extraction (2020)
Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France
Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Towards Word Sense Disambiguation by Reasoning (2020)
Vampire 2018 and Vampire 2019. The 5th and 6th Vampire Workshops. EPiC Series in Computing. Pages 19-29. ISSN: 2398-7340
Rodrigo Agerri, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, Eneko Agirre
Give your Text Representation Models some Love: the Case for Basque (2020)
Proceedings of LREC. Also available at arxiv https://arxiv.org/pdf/2004.00033.pdf
Begoña Altuna, María Jesús Aranzabe, Arantza Díaz de Ilarraza
EusTimeML: A mark-up language for temporal information in Basque (2020)
Research in Corpus Linguistics 8: 86-104. ISSN 2243-4712. Asociación Española de Lingüística de Corpus (AELINCO) DOI 10.32714/ricl.08.01.06
Rodrigo Agerri, German Rigau
Language independent sequence labelling for Opinion Target Extraction (2020)
International Joint Conference on Artificial Intelligence (IJCAI 2020)
Elena Zotova, Rodrigo Agerri, Manuel Nuñez and German Rigau
Multilingual Stance Detection in Tweets: The Catalonia Independence Corpus (2020)
Language Resources and Evaluation Conference (LREC 2020)
Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Applying the Closed World Assumption to SUMO-based FOL Ontologies for Effective Commonsense Reasoning (2020)
Frontiers in Artificial Intelligence and Applications. Giuseppe De Giacomo, Alejandro Catala, Bistra Dilkina, Michela Milano, Senén Barro, Alberto Bugarín, Jérôme Lang (eds.). Volume 325: ECAI 2020. Pages 585 - 592. IOS Press Ebooks
Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana Garcia-Serrano, Mohamed Ben Aouicha, Eneko Agirre, David Sánchez
A large reproducible benchmark of ontology-based methods and word embeddings for word similarity (2020)
Information Systems. Online first.
Iker de la Iglesia, Mikel Martinez-Puente, Alexander Platas, Iria San Miguel, Aitziber Atutxa, Koldo Gojenola
MEDIA team at the CLEF-2020 Multilingual Information Extraction Task (2020)
Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum Thessaloniki, Greece, September 22-25, 2020.
Eneko Agirre, Marianna Apidianaki, Ivan Vulić (Editors)
Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2020)
In conjunction with EMNLP. Association for Computational Linguistics
Rodrigo Agerri, German Rigau
Projecting Heterogeneous Annotations for Named Entity Recognition (2020)
In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020). Winner of the CAPITEL@IberLEF task on Spanish NER.
María Espinosa, Rodrigo Agerri, Roberto Centeno, Alvaro Rodrigo
DeepReading@SardiStance: Combining Textual, Social and Emotional Features. (2020)
Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020). Winners of the SardiStance@Evalita2020 shared task.
Rodrigo Agerri, German Rigau
Language independent sequence labelling for Opinion Target Extraction (2019)
Artificial Intelligence, 268 (2019) 85-95
Iñigo Lopez-Gazpio, Montse Maritxalar, Mirella Lapata, Eneko Agirre
Word n-gram attention models for sentence similarity and inference (2019)
Expert Systems with Applications. Volume 132, 15 October 2019, Pages 1-11. https://doi.org/10.1016/j.eswa.2019.04.054.
Aitor Ormazabal, Mikel Artetxe, Gorka Labaka, Aitor Soroa and Eneko Agirre
Analyzing the Limitations of Cross-lingual Word Embedding Mappings (2019)
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4990-4995.
Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana García-Serrano, Mohamed Ben Aouicha, Eneko Agirre
Reproducibility dataset for a large experimental survey on word embeddings and ontology-based methods for word similarity (2019)
Data in Brief, Volume 26.
Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana García-Serrano, Mohamed Ben Aouicha, Eneko Agirre
A reproducible survey on word embeddings and ontology-based methods for word similarity: linear combinations outperform the state of the art (2019)
Engineering Applications of Artificial Intelligence. Volume 85, October 2019, Pages 645-665.
Andrea Amelio Ravelli, Oier Lopez de Lacalle, Eneko Agirre
A comparison of representation models in a non-conventional semantic similarity scenario (2019)
Proceedings of the Sixth Italian Conference on Computational Linguistics, Bari, Italy.
Rodrigo Agerri
Doris Martin at SemEval-2019 Task 4: Hyperpartisan News Detection with Generic Semi-supervised Features (2019)
SemEval@NAACL-HLT2019: 944-948 https://www.aclweb.org/anthology/S19-2161.pdf
Joseba Fernandez de Landa, Rodrigo Agerri, Iñaki Alegria
Euskaldun gazte eta helduen harremanak Twitterren (2019)
III. Ikergazte. Nazioarteko ikerketa euskaraz. Kongresuko artikulu bilduma. Gizarte Zientziak eta Zuzenbidea. 2, pp. 83 - 90
Itziar Gonzalez-Dios, Javier Alvez, and German Rigau
Exploiting Metonymy from Available Knowledge Resources. (2019)
20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2019). (TO APPEAR in LNCS)
Javier Álvez, Montserrat Hermo, Paqui Lucio, German Rigau
Automatic white-box testing of first-order logic ontologies (2019)
Journal of Logic and Computation, Volume 29, Issue 5, September 2019, Pages 723–751
Javier Álvez, Paqui Lucio, German Rigau
A Framework for the Evaluation of SUMO-Based Ontologies Using WordNet (2019)
IEEE Access, 7, 36075-36093. 2019
Mark Stevenson, Eneko Agirre
Word Sense Disambiguation (2018)
The Oxford Handbook of Computational Linguistics 2nd edition (2 ed.) Edited by Ruslan Mitkov. Oxford. ISBN: 9780199573691. DOI of the chapter: 10.1093/oxfordhb/9780199573691.013.28
Josu Goikoetxea, Aitor Soroa and Eneko Agirre
Knowledge-Based Systems (KNOSYS). Volume 150, 15 June 2018, Pages 218-230. ISSN: 0950-7051. DOI https://doi.org/10.1016/j.knosys.2018.03.017 Preprint at https://arxiv.org/pdf/1804.08316.pdf
Rodrigo Agerri, Yiling Chung, Itziar Aldabe, Nora Aranberri, Gorka Labaka, German Rigau
Building Named Entity Recognition Taggers via Parallel Corpora (2018)
In Proceedings of the 11th Language Resources and Evaluation Conference (LREC 2018), 7-12 May, 2018, Miyazaki, Japan.
Ander Barrena, Aitor Soroa, Eneko Agirre
Learning text representations for 500K classification tasks on Named Entity Disambiguation (2018)
The SIGNLL Conference on Computational Natural Language Learning CONLL 2018
Rodrigo Agerri, German Rigau
Simple Language Independent Sequence Labelling for the Annotation of Disabilities in Medical Texts (2018)
Proceedings of the Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018), Diann Track, Sevilla, Spain.
Egoitz Laparra, Rodrigo Agerri, Itziar Aldabe, German Rigau
Multi-lingual and Cross-lingual timeline extraction (2017)
Knowledge-Based Systems, 133, 77-89
Itziar Aduriz, Iñaki Alegria, Olatz Arregi, Arantza Diaz de Ilarraza, Kepa Sarasola
Hizkuntza-teknologia “Datu Handien” garaian: programa bilatzaileak, itzultzaileak… (2017)
Senez, 48, pp. 191-200. ISSN: 1132-2152. 2017 https://eizie.eus/eu/argitalpenak/senez/20171102/aurkezpena/datuhandiak
Goikoetxea J., Agirre E., Soroa A.
Single or Multiple. Combining Word Representations Independently Learned from Text and WordNet (2016)
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. pp. 2608-2614. ISBN: 978-1-57735-760-5. Phoenix (USA).
Rodrigo Agerri, German Rigau
Robust Multilingual Named Entity Recognition with Shallow Semi-supervised Features (2016)
Artificial Intelligence, 238 (2016) pages 63-82. http://dx.doi.org/10.1016/j.artint.2016.05.003
Hugo Manguinhas, Nuno Freire, Antoine Isaac, Juliane Stiller, Valentine Charles, Aitor Soroa, Rainer Simon, Vladimir Alexiev
Exploring Comparative Evaluation of Semantic Enrichment Tools for Cultural Heritage Metadata (2016)
Proceedings of the 20th International Conference on Theory and Practice of Digital Libraries, TPDL 2016, Vol 9818, pp 266-278
Ander Intxaurrondo, Eneko Agirre, Oier Lopez de Lacalle, Mihai Surdeanu
Diamonds in the Rough: Event Extraction from Imperfect Microblog Data (2015)
Proceedings of the North American chapter of the Association for Computational Linguistics (NAACL HLT), pages: 641-650. ISBN: 978-1-941643-49-5.
Goikoetxea J., Agirre E., Soroa A.
Random Walks and Neural Network Language Models on Knowledge Bases (2015)
Proceedings of the Annual Meeting of the North American chapter of the Association of Computational Linguistics (NAACL HLT 2015), pages 1434-1439. ISBN: 978-1-937284-73-2. Denver (USA).
Igor Leturia, Kepa Sarasola, Xabier Arregi, Arantza Diaz de Ilarraza, Eva Navas, Iñaki Sainz, Arantza del Pozo, David Baranda, Urtza Iturraspe
BerbaTek: euskararako hizkuntza teknologien garapena itzulpengintza, edukien kudeaketa eta irakaskuntza arloetan (2013)
Euskalingua aldizkari digitala, 23, 66-76. http://mendebalde.eus/euskalinguak/Euskalingua%2023/Berbatek:%20euskararako%20hizkuntza%20teknologien%20garapena%20itzulpengintza,%20edukien%20kudeaketa%20eta%20irakaskuntza%20arloetan.pdf
Mark Hall, Eneko Agirre, Nikolas Aletras, Runar Bergheim, Kostas Chandrinos, Paul Clough, Samuel Fernando, Kate Fernie, Paula Goodale, Jill Griffiths, Oier Lopez de Lacalle, Andrea de Polo, Aitor Soroa, Mark Stevenson
PATHS - Exploring Digital Cultural Heritage Spaces (2012)
Theory and Practice of Digital Libraries 2012. ISBN 9783642332906 ISSN 0302-9743
Arantxa Otegi
Hedapena informazioaren berreskurapenean: hitzen adiera-desanbiguazioaren eta antzekotasun semantikoaren ekarpenak (2012)
Lengoaia eta Sistema Informatikoak Saila, EHU/UPV. Informatika Fakultatea. 2012/03/16
Iñaki Alegria, Bertol Arrieta, Arantza Diaz de Ilarraza, Elixabete Izagirre, Montse Maritxalar
Using Machine Learning Techniques to Build a Comma Checker for Basque (2006)
Proceedings of Coling-ACL 2006. Sydney, Australia. ISBN: 1-932432-69-8, pp. 1-8. https://aclanthology.org/P06-4000/
A. Casillas, V. Fresno, R. Martínez, S. Montalvo
Evaluación del clustering de páginas web mediante funciones de peso y combinación heurística de criterios (2005)
Revista Española para el Procesamiento del Lenguaje Natural, 35, 417-424. https://1library.co/document/yn4mkjpz-evaluacion-clustering-paginas-mediante-funciones-combinacion-heuristica-criterios.html
ie_ir_tabs_full
Demo of the NewsReader NLP pipeline
Just copy in any English text and see what entities and events and other annotations are added automatically. The result is represented in the NAF format.
Demo of the NewsReader NLP pipeline
Just copy in any Spanish text and see what entities and other annotations are added automatically. The result is represented in the NAF format
Eihera
Basque named entities recognizer/classifier
Eustagger
Basque lemmatizer and morphosyntactic analyzer
DeepKnowledge (PID2021-127777OB-C21) project funded by MCIN/AEI/10.13039/501100011033 and by "ERDF A way of making Europe"
(2022 - 2025)- IC4LANG: Aprendizaje En contexto como nuevo paradigma para investigar tecnologías del lenguaje escalables y de alta precisión adaptadas a las necesidades industriales del País Vasco
(2023 - 2025)
Antidote (PCI2020-120717-2) funded by MCIN/AEI /10.13039/501100011033 and by European Union NextGenerationEU/PRTR
(2021 - 2024)- Disargue: Few-shot Learning and Argumentation to Detect and Fight Misinformation in Social Media
Disargue (TED2021-130810B-C21) funded by MCIN/AEI /10.13039/501100011033 and by European Union NextGenerationEU/ PRTR
(2022 - 2024)
Better Extraction from Text Towards Enhanced Retrieval
(2019 - 2023)
Tools for the analysis of parliamentary discourses: polarization, subjectivity and affectivity in the post-truth era
(2020 - 2022)
DeepReading: Mining, Understanding, and Reasoning with Multilingual Content.
(2019 - 2021)
Deep learning, Big Data and knowledge for multilingual text processing.
(2019 - 2021)
New generation of neural artificial intelligence models to transform language technologies in the Basque Country's industry.
(2020 - 2021)
Automated surveillance of key questions on COVID-19 in scientific publications
(2020 - 2021)
Learning to Interact with Humans by Lifelong Interaction with Humans
(2017 - 2020)- CROSSTEXT: Automatic Generation of Multilingual Semantic Processors
Automatic generation of multilingual semantic taggers
(2017 - 2019)
TUNER: Automatic domain adaptation for semantic processing.
(2016 - 2018)- MUSTER: Multimodal processing of Spatial and TEmporal expRessions: Toward Understanding Space and Time in Language Enhanced by Vision.
Multimodal processing of Spatial and TEmporal expRessions: Toward Understanding Space and Time in Language Enhanced by Vision.
(2016 - 2018) - Openminted: Sharing IXA pipes in the OpenMinTeD platform.
Openminted: Sharing IXA pipes in the OpenMinTeD platform.
(2018 - 2018) All HiTZ projects
- EIEC
Basque Named Entity Recognition corpus. - EDIEC
Basque corpus annotated for Named Entity Disambiguation. - MCR: Multilingual Central Repository
Multilingual lexical database with wordnets for several European languages, including Basque. - EPEC-EuSemcor
Corpus tagged with Basque WordNet senses.
Ander Salaberria, Gorka Azkune, Oier Lopez de Lacalle, Aitor Soroa, Eneko Agirre
Image captioning for effective use of language models in knowledge-based visual question answering (2023)
Expert Systems with Applications, 2023, vol. 212, p. 118669. Preprint: https://arxiv.org/abs/2109.08029
Nayla Escribano, German Rigau, Rodrigo Agerri
A modular approach for multilingual timex detection and normalization using deep learning and grammar-based methods (2023)
Nayla Escribano, German Rigau, Rodrigo Agerri, A modular approach for multilingual timex detection and normalization using deep learning and grammar-based methods, Knowledge-Based Systems, Volume 273, 2023, 110612, ISSN 0950-7051, https://doi.org/10.1016/j.knosys.2023.110612. (https://www.sciencedirect.com/science/article/pii/S0950705123003623) Abstract: Detecting and normalizing temporal expressions is an essential step for many NLP tasks. While a variety of methods have been proposed for detection, best normalization approaches rely on hand-crafted rules. Furthermore, most of them have been designed only for English. In this paper we present a modular multilingual temporal processing system combining a fine-tuned Masked Language Model for detection, and a grammar-based normalizer. We experiment in Spanish and English and compare with HeidelTime, the state-of-the-art in multilingual temporal processing. We obtain best results in gold timex normalization, timex detection and type recognition, and competitive performance in the combined TempEval-3 relaxed value metric. A detailed error analysis shows that detecting only those timexes for which it is feasible to provide a normalization is highly beneficial in this last metric. This raises the question of which is the best strategy for timex processing, namely, leaving undetected those timexes for which is not easy to provide normalization rules or aiming for high coverage. Keywords: Temporal processing; Multilingualism; Sequence labeling; Grammar-based approaches; Deep learning; Natural language processing
Murali Kondragunta, Olatz Perez-de-Viñaspre, Maite Oronoz
Improving and Simplifying Template-Based Named Entity Recognition (2023)
In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 79–86, Dubrovnik, Croatia. Association for Computational Linguistics. May 2023, Dubrovnik, Croatia.
Rodrigo Agerri, Eneko Aigrre
Lessons learned from the evaluation of Spanish Language Models (2023)
Procesamiento del Lenguaje Natural (70), pp 157-170
Gorka Urbizu, Iñaki San Vicente, Xabier Saralegi, Rodrigo Agerri, Aitor Soroa
Scaling Laws for BERT in Low-Resource Settings (2023)
Findings of the Association for Computational Linguistics: ACL 2023
Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heintz, Dan Roth
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey (2023)
ACM Computing Surveys. 27 June 2023
Jeremy Barnes, Samia Touileb, Petter Mæhlum, Pierre Lison
Identifying Token-Level Dialectal Features in Social Media (2023)
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Arantxa Otegi, Iñaki San Vicente, Xabier Saralegi, Anselmo Peñas, Borja Lozano, Eneko Agirre
Information retrieval and question answering: A case study on COVID-19 scientific literature (2022)
Knowledge-Based Systems, Volume 240.
Oscar Sainz, Itziar Gonzalez-Dios, Oier Lopez de Lacalle, Bonan Min, Eneko Agirre
Textual Entailment for Event Argument Extraction: Zero- and Few-Shot with Multi-Source Learning (2022)
In Findings of the Association for Computational Linguistics: NAACL 2022, Seattle, Washington. Association for Computational Linguistics.
Oscar Sainz, Haoling Qiu, Oier Lopez de Lacalle, Eneko Agirre, Bonan Min
ZS4IE: A toolkit for Zero-Shot Information Extraction with simple Verbalizations (2022)
In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations, Seattle, Washington. Association for Computational Linguistics.
Eneko Agirre
Few-shot Information Extraction is Here: Pre-train, Prompt and Entail (2022)
In Few-shot Information Extraction is Here: Pre-train, Prompt and Entail
E Agirre, M Apidianaki, I Vulić
Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2022)
Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures. Association for Computational Linguistics, Dublin, Ireland
David Samuel, Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, and Erik Velldal
Direct Parsing to Sentiment Graphs (2022)
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages: 470–478
Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Perez-de-Viñaspre, Rodrigo Agerri
BasqueParl: A Bilingual Corpus of Basque Parliamentary Transcriptions (2022)
Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3382–3390, Marseille, France. European Language Resources Association.
Iker Garcia-Ferrero, Rodrigo Agerri, German Rigau
Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings (2022)
Findings of the Association for Computational Linguistics: EMNLP 2022
Jeremy Barnes, Laura Oberlaender, Enrica Troiano, Andrey Kutuzov, Jan Buchmann, Rodrigo Agerri, Lilja Øvrelid, Erik Velldal
SemEval 2022 Task 10: Structured Sentiment Analysis (2022)
In SemEval 2022
Blanca Calvo Figueras, Montse Cuadros, Rodrigo Agerri
A Semantics-Aware Approach to Automated Claim Verification (2022)
In Proceedings of the Fifth Fact Extraction and VERification Workshop (FEVER), pages 37–48, Dublin, Ireland. Association for Computational Linguistics
Cristina Aceta, Johan Kildal, Izaskun Fernández, Aitor Soroa
Towards an optimal design of natural human interaction mechanisms for a service robot with ancillary way-finding capabilities in industrial environments (2021)
Production & Manufacturing Research, 9:1, 1-32
Ainhoa Serna, Aitor Soroa, Rodrigo Agerri
Applying Deep Learning Techniques for Sentiment Analysis to Assess Sustainable Transport (2021)
Sustainability 13, no. 4: 2397.
Aitzol Elu, Gorka Azkune, Oier Lopez de Lacalle, Ignacio Arganda-Carreras, Aitor Soroa, Eneko Agirre
Inferring spatial relations from textual descriptions of images (2021)
Pattern Recognition, Volume 113, 107847. Pre-print: https://arxiv.org/abs/2102.00997
Eneko Agirre, Marianna Apidianaki, Ivan Vulić (Editors)
Proceedings of Deep Learning Inside Out (DeeLIO): The 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2021)
In conjunction with NAACL. Association for Computational Linguistics
Elena Zotova, Rodrigo Agerri, German Rigau
Semi-automatic generation of multilingual datasets for stance detection in Twitter (2021)
Expert Systems with Applications, 170 (2021).
Joseba Fernandez de Landa, Rodrigo Agerri
Euskarazko on-line artikuluetan aipatutako izendun entitate nabarmenen identifikazioa denbora errealean (2021)
Ekaia
Jon Alkorta
Hacia el análisis de sentimientos en euskera (2021)
Procesamiento del Lenguaje Natural, 66, 201-204.
Joseba Fernandez de Landa, Iker García, Ander Salaberria, Jon Ander Campos
Twitterreko Euskal Komunitatearen Eduki Azterketa Pandemia Garaian (2021)
IV. Ikergazte. Nazioarteko ikerketa euskaraz. Kongresuko artikulu bilduma. Ingeniaritza eta Arkitektura
Ander Barrena, Aitor Soroa, Eneko Agirre
Towards Zero-Shot Cross-Lingual Named Entity Disambiguation (2021)
Expert Systems With Applications ESWA 2021
Oscar Sainz, Oier Lopez de Lacalle, Gorka Labaka, Ander Barrena, Eneko Agirre
Label Verbalization and Entailment for Effective Zero- and Few-Shot Relation Extraction (2021)
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Rodrigo Agerri, Roberto Centeno, María Espinosa, Joseba Fernández de Landa, Álvaro Rodrigo
VaxxStance@IberLEF 2021: Overview of the Task on Going Beyond Text in Cross-Lingual Stance Detection (2021)
Procesamiento del Lenguaje Natural, 67, pp 173-181
Iker García-Ferrero, Rodrigo Agerri, German Rigau
Benchmarking Meta-embeddings: What Works and What Does Not (2021)
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021
Yi-Ling Chung, Marco Guerini, Rodrigo Agerri
Multilingual Counter Narrative Type Classification (2021)
Proceedings of Argument Mining 2021
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Manuela Speranza, Roberto Zanoli
The E3C Project: European Clinical Case Corpus (2021)
Proceedings of the Annual Conference of the Spanish Association for Natural Language Processing: Projects and Demonstrations (SEPLN-PD 2021). Pages 17-20. ISSN: 1613-0073. URL: http://ceur-ws.org/Vol-2968/paper5.pdf
Eneko Agirre
Cross-Lingual Word Embeddings (Book Review) (2020)
Computational Linguistics 46 (1), 245-248. (https://doi.org/10.1162/COLI_r_00372)
Oier Lopez de Lacalle, Ander Salaberria, Aitor Soroa, Gorka Azkune and Eneko Agirre
Evaluating Multimodal Representations on Visual Semantic Textual Similarity (2020)
Proceedings of the Twenty-third European Conference on Artificial Intelligence, ECAI 2020, June 8-12, 2020, Santiago de Compostela, Spain
Oscar Sainz, Oier Lopez de Lacalle, Itziar Aldabe, Montse Maritxalar
Domain Adapted Distant Supervision for Pedagogically Motivated Relation Extraction (2020)
Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France
Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Towards Word Sense Disambiguation by Reasoning (2020)
Vampire 2018 and Vampire 2019. The 5th and 6th Vampire Workshops. EPiC Series in Computing. Pages 19-29. ISSN: 2398-7340
Rodrigo Agerri, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, Eneko Agirre
Give your Text Representation Models some Love: the Case for Basque (2020)
Proceedings of LREC. Also available on arXiv: https://arxiv.org/pdf/2004.00033.pdf
Begoña Altuna, María Jesús Aranzabe, Arantza Díaz de Ilarraza
EusTimeML: A mark-up language for temporal information in Basque (2020)
Research in Corpus Linguistics 8: 86-104. ISSN 2243-4712. Asociación Española de Lingüística de Corpus (AELINCO) DOI 10.32714/ricl.08.01.06
Rodrigo Agerri, German Rigau
Language independent sequence labelling for Opinion Target Extraction (2020)
International Joint Conference on Artificial Intelligence (IJCAI 2020)
Elena Zotova, Rodrigo Agerri, Manuel Nuñez and German Rigau
Multilingual Stance Detection in Tweets: The Catalonia Independence Corpus (2020)
Language Resources and Evaluation Conference (LREC 2020)
Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Applying the Closed World Assumption to SUMO-based FOL Ontologies for Effective Commonsense Reasoning (2020)
Frontiers in Artificial Intelligence and Applications. Giuseppe De Giacomo, Alejandro Catala, Bistra Dilkina, Michela Milano, Senén Barro, Alberto Bugarín, Jérôme Lang (eds.). Volume 325: ECAI 2020. Pages 585 - 592. IOS Press Ebooks
Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana Garcia-Serrano, Mohamed Ben Aouicha, Eneko Agirre, David Sánchez
A large reproducible benchmark of ontology-based methods and word embeddings for word similarity (2020)
Information Systems. Online first.
Iker de la Iglesia, Mikel Martinez-Puente, Alexander Platas, Iria San Miguel, Aitziber Atutxa, Koldo Gojenola
MEDIA team at the CLEF-2020 Multilingual Information Extraction Task (2020)
Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum Thessaloniki, Greece, September 22-25, 2020.
Eneko Agirre, Marianna Apidianaki, Ivan Vulić (Editors)
Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2020)
In conjunction with EMNLP. Association for Computational Linguistics
Rodrigo Agerri, German Rigau
Projecting Heterogeneous Annotations for Named Entity Recognition (2020)
In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020). Winner of the CAPITEL@IberLEF task on Spanish NER.
María Espinosa, Rodrigo Agerri, Roberto Centeno, Alvaro Rodrigo
DeepReading@SardiStance: Combining Textual, Social and Emotional Features (2020)
Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020). Winners of the SardiStance@Evalita2020 shared task.
Rodrigo Agerri, German Rigau
Language independent sequence labelling for Opinion Target Extraction (2019)
Artificial Intelligence, 268 (2019) 85-95
Iñigo Lopez-Gazpio, Montse Maritxalar, Mirella Lapata, Eneko Agirre
Word n-gram attention models for sentence similarity and inference (2019)
Expert Systems with Applications. Volume 132, 15 October 2019, Pages 1-11. https://doi.org/10.1016/j.eswa.2019.04.054.
Aitor Ormazabal, Mikel Artetxe, Gorka Labaka, Aitor Soroa and Eneko Agirre
Analyzing the Limitations of Cross-lingual Word Embedding Mappings (2019)
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4990-4995.
Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana García-Serrano, Mohamed Ben Aouicha, Eneko Agirre
Reproducibility dataset for a large experimental survey on word embeddings and ontology-based methods for word similarity (2019)
Data in Brief, Volume 26.
Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana García-Serrano, Mohamed Ben Aouicha, Eneko Agirre
A reproducible survey on word embeddings and ontology-based methods for word similarity: linear combinations outperform the state of the art (2019)
Engineering Applications of Artificial Intelligence. Volume 85, October 2019, Pages 645-665.
Andrea Amelio Ravelli, Oier Lopez de Lacalle, Eneko Agirre
A comparison of representation models in a non-conventional semantic similarity scenario (2019)
Proceedings of the Sixth Italian Conference on Computational Linguistics, Bari, Italy.
Rodrigo Agerri
Doris Martin at SemEval-2019 Task 4: Hyperpartisan News Detection with Generic Semi-supervised Features (2019)
SemEval@NAACL-HLT2019: 944-948 https://www.aclweb.org/anthology/S19-2161.pdf
Joseba Fernandez de Landa, Rodrigo Agerri, Iñaki Alegria
Euskaldun gazte eta helduen harremanak Twitterren (2019)
III. Ikergazte. Nazioarteko ikerketa euskaraz. Kongresuko artikulu bilduma. Gizarte Zientziak eta Zuzenbidea. 2, pp. 83 - 90
Itziar Gonzalez-Dios, Javier Alvez, and German Rigau
Exploiting Metonymy from Available Knowledge Resources (2019)
20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2019). (TO APPEAR in LNCS)
Javier Álvez, Montserrat Hermo, Paqui Lucio, German Rigau
Automatic white-box testing of first-order logic ontologies (2019)
Journal of Logic and Computation, Volume 29, Issue 5, September 2019, Pages 723–751
Alvez, J.; Lucio, P.; Rigau, G.
A Framework for the Evaluation of SUMO-Based Ontologies Using WordNet (2019)
IEEE Access, 7, 36075-36093. 2019
Mark Stevenson, Eneko Agirre
Word Sense Disambiguation (2018)
The Oxford Handbook of Computational Linguistics 2nd edition (2 ed.) Edited by Ruslan Mitkov. Oxford. ISBN: 9780199573691. DOI of the chapter: 10.1093/oxfordhb/9780199573691.013.28
Josu Goikoetxea, Aitor Soroa and Eneko Agirre
Knowledge-Based Systems (KNOSYS). Volume 150, 15 June 2018, Pages 218-230. ISSN: 0950-7051. DOI https://doi.org/10.1016/j.knosys.2018.03.017 Preprint at https://arxiv.org/pdf/1804.08316.pdf
Rodrigo Agerri, Yiling Chung, Itziar Aldabe, Nora Aranberri, Gorka Labaka, German Rigau
Building Named Entity Recognition Taggers via Parallel Corpora (2018)
In Proceedings of the 11th Language Resources and Evaluation Conference (LREC 2018), 7-12 May, 2018, Miyazaki, Japan.
Ander Barrena, Aitor Soroa, Eneko Agirre
Learning text representations for 500K classification tasks on Named Entity Disambiguation (2018)
The SIGNLL Conference on Computational Natural Language Learning CONLL 2018
Rodrigo Agerri, German Rigau
Simple Language Independent Sequence Labelling for the Annotation of Disabilities in Medical Texts (2018)
Proceedings of the Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018), Diann Track, Sevilla, Spain.
Egoitz Laparra, Rodrigo Agerri, Itziar Aldabe, German Rigau
Multi-lingual and Cross-lingual timeline extraction (2017)
Knowledge-Based Systems, 133, 77-89
Itziar Aduriz, Iñaki Alegria, Olatz Arregi, Arantza Diaz de Ilarraza, Kepa Sarasola
Hizkuntza-teknologia “Datu Handien” garaian: programa bilatzaileak, itzultzaileak… (2017)
Senez, 48, pp. 191-200. ISSN: 1132-2152. 2017 https://eizie.eus/eu/argitalpenak/senez/20171102/aurkezpena/datuhandiak
Goikoetxea J., Agirre E., Soroa A.
Single or Multiple. Combining Word Representations Independently Learned from Text and WordNet (2016)
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. pp. 2608-2614. ISBN: 978-1-57735-760-5. Phoenix (USA).
Rodrigo Agerri, German Rigau
Robust Multilingual Named Entity Recognition with Shallow Semi-supervised Features (2016)
Artificial Intelligence, 238 (2016) pages 63-82. http://dx.doi.org/10.1016/j.artint.2016.05.003
Hugo Manguinhas, Nuno Freire, Antoine Isaac, Juliane Stiller, Valentine Charles, Aitor Soroa, Rainer Simon, Vladimir Alexiev
Exploring Comparative Evaluation of Semantic Enrichment Tools for Cultural Heritage Metadata (2016)
Proceedings of the 20th International Conference on Theory and Practice of Digital Libraries, TPDL 2016, Vol 9818, pp 266-278
Ander Intxaurrondo, Eneko Agirre, Oier Lopez de Lacalle, Mihai Surdeanu
Diamonds in the Rough: Event Extraction from Imperfect Microblog Data (2015)
Proceedings of the North American chapter of the Association for Computational Linguistics (NAACL HLT), pages: 641-650. ISBN: 978-1-941643-49-5.
Goikoetxea J., Agirre E., Soroa A.
Random Walks and Neural Network Language Models on Knowledge Bases (2015)
Proceedings of the Annual Meeting of the North American chapter of the Association of Computational Linguistics (NAACL HLT 2015), pages 1434-1439. ISBN: 978-1-937284-73-2. Denver (USA).
Igor Leturia, Kepa Sarasola, Xabier Arregi, Arantza Diaz de Ilarraza, Eva Navas, Iñaki Sainz, Arantza del Pozo, David Baranda, Urtza Iturraspe
BerbaTek: euskararako hizkuntza teknologien garapena itzulpengintza, edukien kudeaketa eta irakaskuntza arloetan (2013)
Euskalingua aldizkari digitala, 23, 66-76. http://mendebalde.eus/euskalinguak/Euskalingua%2023/Berbatek:%20euskararako%20hizkuntza%20teknologien%20garapena%20itzulpengintza,%20edukien%20kudeaketa%20eta%20irakaskuntza%20arloetan.pdf
Mark Hall, Eneko Agirre, Nikolas Aletras, Runar Bergheim, Kostas Chandrinos, Paul Clough, Samuel Fernando, Kate Fernie, Paula Goodale, Jill Griffiths, Oier Lopez de Lacalle, Andrea de Polo, Aitor Soroa, Mark Stevenson
PATHS - Exploring Digital Cultural Heritage Spaces (2012)
Theory and Practice of Digital Libraries 2012. ISBN 9783642332906 ISSN 0302-9743
Arantxa Otegi
Hedapena informazioaren berreskurapenean: hitzen adiera-desanbiguazioaren eta antzekotasun semantikoaren ekarpenak (2012)
Lengoaia eta Sistema Informatikoak Saila, EHU/UPV. Informatika Fakultatea. 2012/03/16
Iñaki Alegria, Bertol Arrieta, Arantza Diaz de Ilarraza, Elixabete Izagirre, Montse Maritxalar
Using Machine Learning Techniques to Build a Comma Checker for Basque (2006)
Proceedings of Coling-ACL 2006. Sydney, Australia. ISBN: 1-932432-69-8, pp. 1-8. https://aclanthology.org/P06-4000/
A. Casillas, V. Fresno, R. Martínez, S. Montalvo
Evaluación del clustering de páginas web mediante funciones de peso y combinación heurística de criterios (2005)
Revista Española para el Procesamiento del Lenguaje Natural, 35, 417-424. https://1library.co/document/yn4mkjpz-evaluacion-clustering-paginas-mediante-funciones-combinacion-heuristica-criterios.html