D 2012

Low-cost ontology development

GRÁC, Marek and Adam RAMBOUSEK

Basic information

Original name

Low-cost ontology development

Name in Czech

Levný vývoj ontologie

Authors

GRÁC, Marek (703 Slovakia, belonging to the institution) and Adam RAMBOUSEK (203 Czech Republic, guarantor, belonging to the institution)

Edition

Matsue, Japan, 6th International Global Wordnet Conference Proceedings, p. 299-304, 6 pp. 2012

Publisher

Toyohashi University of Technology

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

Informatics

Country of publisher

Japan

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

RIV identification code

RIV/00216224:14330/12:00057239

Organization

Fakulta informatiky – Repository

ISBN

978-80-263-0244-5

Keywords (in Czech)

ontologie; WordNet; anotace; VerbaLex

Keywords in English

ontology; WordNet; annotation; VerbaLex

Links

GAP401/10/0792, research and development project. GA102/09/1842, research and development project. LC536, research and development project. LM2010013, research and development project. VF20102014003, research and development project.
Changed: 9/7/2022 02:58, RNDr. Daniel Jakubík

Abstract

In the original

In this paper, we present a project to build a new lexical resource: a shallow ontology derived from corpora. The ontology is intended primarily for machine translation, syntactic parsing, and word sense disambiguation. The ontology is currently being developed for Czech, but the methodology and tools are suitable for other languages with a similar structure. The ontology is based on the BushBank corpus, which improves the handling of ambiguity in natural language. BushBank data and tools are application-driven, which reduces the time and cost needed to annotate corpora and develop new lexical resources.

In Czech

V článku je představen projekt budování nového lexikálního zdroje - mělké ontologie odvozené z korpusu. Ontologie by měla být primárně použita pro strojový překlad, syntaktické parsování a dezambiguaci významu slov. V současné době probíhá tvorba ontologie pro češtinu, ale metodologie a nástroje jsou vhodné i pro další jazyky s podobnou strukturou. Ontologie je založena na korpusu BushBank, který vylepšuje práci s nejednoznačnostmi v přirozeném jazyce. Data a nástroje BushBank jsou založena na aplikacích, tím se redukuje čas a náklady potřebné k anotaci korpusu a tvorby lexikálních zdrojů.
