Technology

Research results lead to open source software and solutions for the industry.

The HLT unit develops state-of-the-art technology in all of its main research areas. The group has performed consistently well in several international evaluations and is currently engaged in international open source software development projects (e.g. the Moses platform for statistical machine translation). Its research on speech recognition also meets the highest standards and has reached the application market on several occasions.
Moreover, members of the unit are key players in many international evaluation and benchmarking initiatives. HLT provides technological support and high-level services to optimize the activities of the Research Unit. Maintaining a shared, efficient environment tailored to HLT needs ranges from managing special hardware equipment and software tools to creating and managing large-scale linguistic resources.

Software

  • EDITS (Edit Distance Textual Entailment Suite): an open source software package aimed at recognizing entailment relations between two portions of text
  • TextPro: a suite of modular Natural Language Processing (NLP) tools for analysis of Italian and English texts
  • Moses: a phrase-based decoder for statistical machine translation
  • IRSTLM: a toolkit for statistical language modeling
  • jSRE: an open source Java tool for Relation Extraction
  • jWeb1T: an open source Java tool for efficiently searching the Web 1T 5-gram corpus
  • The Tool-box for lexicographers: a web-based application for accessing and updating lexical resources
  • jFex and jInFil: Java tools for Feature Extraction and Instance Filtering
  • jExSLI: an open source Java tool for language identification
  • jWebS: a software tool for Web people search
  • jTCat: a software tool for text categorization
  • StringKernel: an implementation of the string kernel
  • jLSI: an open source Java tool for Latent Semantic Indexing

Databases

  • MultiWordNet: a Multilingual (English/Italian) Lexical Database
  • WordNet Domains: a systematic labeling of WordNet synsets with domain labels; it includes WordNet-Affect, an additional labeling of the synsets that represent affective concepts with "affective" domain labels

Corpora

Electronic Dictionaries/Spell Checkers

Demos

  • TextPro: a suite of tools for analysis of English and Italian texts
  • The Wiki Machine: a tool for linking to Wikipedia