demos

Analhitza

Online tool to extract linguistic information from a corpus

AnCora

AnCora consists of a Basque corpus (EPEC-EU), a Spanish corpus (ANCORA-CAS) and a Catalan corpus (ANCORA-CAT).

Anhitz

A Basque-Speaking Virtual 3D Expert on Science and Technology

BASYQUE

A web application to analyse syntactic variation of Basque dialects

Berbatek_bikoizketa

Automatic dubbing of documentaries.

Berbatek_Irakasle

Personal teacher for language learning

BertsolariXa

Finds words ending in a given rhyme.

Biografix

Biografix is a multilingual NLP tool that removes the parenthetical biographical structures and creates new sentences out of them.

Diccionario Básico Escolar

Basic dictionary for students (Cuba).

DoQA

A dataset for domain-specific FAQs via conversational QA

e-ROLda

A tool for looking up verb entries in the BVI lexicon and examples in the EPEC-RolSem corpus

EDBL

EDBL lexical database

EDGK

Rule-based Dependency Grammar for Basque

EDIEC

Basque Disambiguated Named Entities Corpus

EIEC

Basque Named Entities Corpus

Eihera

Basque named entities recognizer/classifier

Elhuyar-Word

Dictionary system integrated into the Word 2000 word processor.

ElkarHizketak

Conversational Question Answering dataset in Basque

EPEC-DEP (BDT)

A syntactic corpus tagged using the Dependency Grammar Theory

EPEC-EuSemcor

Corpus tagged with Basque WordNet senses

EPEC-KORREF

Basque Coreference Corpus

Erreus

A database system for storing errors

Eulia

Environment for text tagging

EusEduSeg: automatic discourse segmenter for Basque

The EusEduSeg tool splits a text file into adverbial or adjunct clauses that contain a verb, and outputs the result in three different formats: i) a text file, with line breaks; ii) RS3 format, for annotating relational discourse structure with the RSTTool tool; and iii) a format for use with the DiZer automatic discourse parser.

Euskal RST Treebank

Basque RST relation- and tree-bank

Basque Wikipedia export (version of 7 April 2016)

Export of the Basque Wikipedia

EUSMT

Statistical Machine Translation from Spanish to Basque

Eustagger

Basque lemmatizer and morphosyntactic analyzer

EusWN

Basque Wordnet

Gero Corpus Historikoa

Datasets for modernising historical Basque words

Ihardetsi

A Question-Answering system for the area of Science and Technology

IXA pipes: Language Processing tools

Multilingual NLP tools

ixaKat

A modular chain of Natural Language Processing tools for Basque

Ixati

Chunker

Konbitzul

Online database of Spanish-Basque Multiword Expression translations

leXkit

Generic XML-based Dictionary CMS

LibiXaml

Library for integrating several linguistic processors

Maltixa

Statistical dependency parser

Maria chatbot

Maria can answer questions about a person or anything else covered by Wikipedia, in three languages: Basque, Spanish and English.

MateIXA

Mate statistical parser for Basque

Matxin

Machine translation from Spanish to Basque

MCR: Multilingual Central Repository

Multilingual lexical database with wordnets for several European languages.

Morfeus

Morphological analyzer

Multimeteo euskaraz

Automatic generation of weather reports

NLTK-eu

Some Basque and Spanish resources to use with NLTK (Natural Language ToolKit)

Opentrad

Machine translation system

QLDB

Lexical database of the Quechua language

Spanish AMR Corpus

Spanish AMR corpus

TZOS-rdf

RDF representation of TZOS terminology

UKB

Graph-based word sense disambiguation and similarity

Universal Dependencies treebank for Basque

Universal Dependencies treebank for Basque

VecMap: cross-lingual word embedding mappings

Open source implementation of our framework to learn cross-lingual word embedding mappings and produce bilingual dictionaries

Web Corpus

Automatically analyzed 150-million-word corpus (up to the syntactic level)

WordNet to DBpedia mapping

A mapping from English WordNet 3.0 URIs to DBpedia 3.9 URIs

WSD-IXA

Word-Sense Disambiguation

Xuxen

Online Basque spelling checker

ZT Corpusa

Morphosyntactically-tagged Science and Technology corpus.

Speech2speech

Multilingual speech-to-speech demo.

Transcribes, translates and synthesizes spoken text.

S2S Demo

ENRICH Demo

The goal of the EU-funded ENRICH project is to modify or augment speech with additional information to make it easier to process. Enrichment reduces listening burden by minimising cognitive load, while maintaining or improving intelligibility. In this demo, voice enrichment is performed using the techniques developed in the project.

ENRICH Demo

AhoTTS Demo
AhoTTS is the Aholab Signal Processing Laboratory Text-to-Speech Synthesizer for Basque and Spanish. You can try it at the following link:

AhoTTS Demo

Bizkaifon

Bizkaifon: speech and video database for the Western dialects of the Basque Language

Bizkaifon is a multimodal (speech and video) database containing thousands of recordings of the many different western dialects of Basque. Most of them are transcribed into Standard Basque. It is accessible on the web through this page. The database is available at ELRA.

AhoSR: Automatic Speech Recognizer for Basque

CALL-CAPT application demos can be accessed here.

Voice conversion demo

Access some voice conversion samples here.

Basque Speech Recognizer

Implemented in the Elhuyar Hiztegia Basque dictionary.

AhoTTS public use

AhoTTS is being used in some public media. You can find it in any news item on eitb.eus or berria.eus.

Speech intelligibility

Enhancing speech intelligibility in noise. Demo

AhoTTS Linguistic Module (Module 1)

The AhoTTS Linguistic Module performs the normalization of a written Basque text and provides an XML-formatted file with the normalized and POS-annotated text. The module can be executed from this link.

Iparrahotsa: AhoTTS for Lapurdian Basque

Iparrahotsa is the AhoTTS version for Lapurdian Basque. This project has been funded by the Euroregion Aquitaine Euskadi under grant EUSKADI 2012-004. You can test it in this demo or download it from SourceForge.

Automatic Speech Dubbing Demo

In the context of the BERBATEK research project, an automatic speech dubbing demo has been developed. The demo integrates three main technologies: automatic text and audio alignment, machine translation and text-to-speech synthesis. Subtitles for documentary videos in Spanish are automatically generated using the audio and the corresponding script. The subtitles are then translated into Basque, and the new audio signal in Basque is generated using AhoTTS.