The Language Archive

Projects

TLA will continue to participate in projects together with international partners. Currently, TLA is involved in the following projects:

AVATecH - Advancing Video Audio Technology in Humanities Research (funded by MPG and FhG). The goal of this project is to investigate and develop technology for semi-automatic annotation of audio and video recordings used in humanities research. Detectors, which will be available via interactive annotation tools and via batch processing, can help with, for example, automatic segmentation, pattern detection and annotation, and finding complex pattern sequences.

CLARA (funded by the EC) is training a new generation of researchers who will be able to cooperate across national boundaries on the establishment of a common language resources infrastructure and its exploitation for the construction of the next generation of language models with wide theoretical and applied significance. The MPI is focusing on AVATecH-like activities within CLARA. CLARA is closely related to the CLARIN goals.

CLARIN - Common Language Resources and Technology Infrastructure (funded by the EC) - is committed to establishing an integrated and interoperable research infrastructure of language resources and technology. The purpose of the infrastructure is to offer persistent services that are secure and provide easy access to language resources and language processing resources. It aims to overcome the current fragmentation by offering a stable, persistent, accessible and extendable infrastructure, thereby enabling eHumanities.

CLARIN-NL (funded by NWO) is the national project that is implementing the CLARIN infrastructure in the Netherlands. This Dutch program offers scholars tools for computer-aided language processing, addressing one or more of the multiple roles language plays (i.e. carrier of cultural content and knowledge, instrument of communication, component of identity and object of study) in the Humanities and Social Sciences.

CLARIN-D (formerly D-SPIN; funded by BMBF) is the national project that is implementing the CLARIN infrastructure in Germany. It has special foci on training and education programs, web-services frameworks, integrating all available language resources and tools, and offering its services to researchers in the Humanities and Social Sciences.

The DASISH (Data Service Infrastructure for the Social Sciences and Humanities, funded by the EC) project brings together all five ESFRI research infrastructure initiatives in the social sciences and humanities (SSH), each represented by several centers: CLARIN, DARIAH, CESSDA, ESS and SHARE. The goal is to identify areas of possible synergy in infrastructure development and to work on a few concrete joint activities. For TLA this is a very interesting opportunity to disseminate resources and tools to other disciplines and to integrate good components from others into the CLARIN infrastructure.

DoBeS (funded by the Volkswagen Foundation) stands for Dokumentation Bedrohter Sprachen, or in English: Documentation of Endangered Languages. In 2000 the Volkswagen Foundation started the DoBeS program to document endangered languages and to archive the results in a proper and easily accessible way. Currently 47 teams are operating worldwide to document about 70 languages that are in danger of becoming extinct.


EUDAT (European Data Infrastructure, funded by the EC) is a first consequence of the report “Riding the Wave” of the EC’s High Level Expert Group on Scientific Data, insofar as it brings together 13 community-driven infrastructure initiatives and 10 data centers to build a first prototype of a Collaborative Data Infrastructure (CDI). In such a CDI the community infrastructures take care of user-oriented services on data, while the data centers take care of common horizontal data services, which are the same or at least very similar for all research disciplines. Both need to address topics such as data curation and the establishment of trust between all stakeholders. EUDAT will focus on professional and robust common services such as: (1) providing an easy deposit for all involved researchers, (2) setting up a distributed architecture that allows the participating centers to easily store large data volumes for preservation and access purposes (which includes safe replication of data), (3) working on policy-rule-based replication at the logical level of collections, and (4) testing generic web service execution frameworks.

HARVE, INTER and RoR are projects which are being carried out in collaboration with the Max Planck Digital Library (funded by MPG). HARVE is establishing an archive federation between three Max Planck Institutes (Social Anthropology-Halle, Human Development-Berlin, Psycholinguistics-Nijmegen) to easily exchange and access data of mutual interest.
INTER is enabling data exchange, based on the METS standard, between eSciDoc and LAMUS, the two repository systems used by the MPDL and the Psycholinguistics institute.
RoR is implementing cross-disciplinary metadata frameworks based on component-based schema principles, registered and well-defined semantic elements, and persistent identifiers such as those offered by EPIC.

The INNET (Innovative Networking in Infrastructure for Endangered Languages, funded by the EC) project will strengthen our international activities, which were started in the DoBeS project on the one hand and in CLARIN on the other. Together with the University of Cologne and colleagues from Poznan and Budapest we will set up three new regional archives and run annual workshops with all experts active in the current and coming regional centers. Best-practice meetings with international guests and summer schools will be organized, and we will develop educational material for use in schools to attract pupils’ attention.

The Radieschen (Rahmenbedingungen einer disziplinübergreifenden Forschungsdateninfrastruktur, funded by DFG) project can be compared with the EUDAT project insofar as it tries to define the basis and roadmap for a future data infrastructure for the research domain in Germany. While EUDAT is already meant to deliver concrete services, Radieschen will conduct many interviews with experts from different stakeholder groups. These interviews will be analyzed along a few major dimensions, with the goal of proposing how the Collaborative Data Infrastructure can be realized in Germany with its federal organizational structure.

The RELISH project (funded by NEH and DFG) - Rendering Endangered Languages Lexicons Interoperable through Standards Harmonisation - will match key European and American digital standards for lexicons. Until now, the divergence of these lexicon standards has impeded international collaboration on language technology for resource creation and analysis, as well as on web services for archive access.

The REPLIX (funded by DEISA, CLARIN and DoBeS) project is studying and implementing the next level in grid-based replication and synchronization of research data at a logical level, in order to achieve safe, policy-based data copying. Our test implementation currently uses iRODS. REPLIX is a joint project between Rechenzentrum Garching and the MPI for Psycholinguistics.

The TextGrid project aims to support access to and exchange of data in the arts and humanities by means of grid technology. Development of a web-based platform began in 2006. The platform will provide researchers with services and tools for analyzing text data in various digital archives, independently of data format, location and software. In this project TLA will initially focus on integrating LEXUS with TextGrid.

Last checked 2012-04-17 by Alexander Koenig

Max Planck Institute
for Psycholinguistics


Street address
Wundtlaan 1
6525 XD Nijmegen
The Netherlands


Mailing address
P.O. Box 310
6500 AH Nijmegen
The Netherlands

Phone:   +31-24-3521911
Fax:        +31-24-3521213