8 ressourcer fundet

Licenser: CLARIN-ACA-NC

Filtrér resultater
  • Digitalisering og opmærkning af trusselsbreve til projektet 'Truslers sprog og genre', der bygger på en innovativ kombination af sprogvidenskab og genrestudier med det formål at...
    • XML
  • DK-CLARIN Reference Corpus of General Danish has been collected as part of DK-CLARIN project, WP2.1, 2008 - 2011. All texts are in XML TEIP5 format (TEIP5DKCLARIN-format), with...
    • XML
  • Binary wordlists for the CST lemmatizer as suplement to the rules of the lemmatizer. Works with both tagged and untagged input. Use: cstlemma -d NAME-OF-WORDLIST
    • HTML
  • The SemDax Corpus is a Danish human-annotated corpus relying on the combined wordnet and dictionary resources: DanNet and Den Danske Ordbog, and available through a CLARIN...
    • XML
  • The aligned corpus consists of press releases from the European Commission Press Relase Database (Rapid) harvested in 2009 and 2011 (http://europa.eu/rapid/search.htm). The...
    • TXT
    • TMX
  • DUDS Jens Bille’s Ballad Book belongs to a corpus of the oldest Danish ballad tradition. The corpus consists of 9 ballad books handed down from Renaissance ballad collectors...
    • XML
  • The DK-CLARIN Parallel Financial Corpus comprises 4.3 M Danish and 4.8 M English tokens from translated (parallel) documents, mainly annual reports, of the period 2002-2010 from...
    • XML
  • The LSP (Language for Special Purposes) corpus consists of texts from seven selected domains. The DK-CLARIN LSP corpus comprises 11 M tokens from the period 2000-2010,...
    • XML