emDep - Dependency parser
About the tool
What is it good for? What does it do?
The tool reveals the dependency relations between the structural units (words, multiword expressions) of a sentence.
What is the input?
Text that has been tokenised and morphologically disambiguated.
What is the output?
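Sentences whose words are arranged in parse trees that show the dependency relations between the units of the sentence. Every token is assigned a syntactic tag and its parent node (the head).
An example (columns: token index, word form, lemma, part-of-speech tag, morphological features, index of the head, syntactic tag; a head index of 0 means the token has no parent):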
Az exkatonát kórházba szállították, ahol két műtétet is végrehajtottak rajta.
1 | Az | az | DET | Definite=Def|PronType=Art | 2 | DET |
2 | exkatonát | exkatona | NOUN | Case=Acc|Number=Sing | 4 | OBJ |
3 | kórházba | kórház | PROPN | Case=Ill|Number=Sing | 4 | OBL |
4 | szállították | szállít | VERB | Definite=Def|Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin|Voice=Act | 0 | ROOT |
5 | , | , | PUNCT | _ | 4 | PUNCT |
6 | ahol | ahol | ADV | PronType=Rel | 10 | LOCY |
7 | két | két | NUM | Case=Nom|NumType=Card|Number=Sing | 8 | ATT |
8 | műtétet | műtét | NOUN | Case=Acc|Number=Sing | 10 | OBJ |
9 | is | is | CONJ | _ | 8 | CONJ |
10 | végrehajtottak | végrehajt | VERB | Definite=Ind|Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin|Voice=Act | 4 | ATT |
11 | rajta | rajta | PRON | Case=Sup|Number=Sing|Person=3|PronType=Prs | 10 | OBL |
12 | . | . | PUNCT | _ | 0 | PUNCT |
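The rows above can be read back into a small in-memory structure. The following is a minimal sketch, not part of the tool: it assumes the pipe-delimited layout exactly as displayed on this page (files written by the parser may use a different column delimiter), and the names `Token`, `parse_row`, `parse_rows` and `dependents` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass

# Rows copied from the example above. " | " separates columns, while a bare "|"
# inside the fifth column separates individual morphological features.
EXAMPLE_ROWS = """\
1 | Az | az | DET | Definite=Def|PronType=Art | 2 | DET |
2 | exkatonát | exkatona | NOUN | Case=Acc|Number=Sing | 4 | OBJ |
3 | kórházba | kórház | PROPN | Case=Ill|Number=Sing | 4 | OBL |
4 | szállították | szállít | VERB | Definite=Def|Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin|Voice=Act | 0 | ROOT |
5 | , | , | PUNCT | _ | 4 | PUNCT |
6 | ahol | ahol | ADV | PronType=Rel | 10 | LOCY |
7 | két | két | NUM | Case=Nom|NumType=Card|Number=Sing | 8 | ATT |
8 | műtétet | műtét | NOUN | Case=Acc|Number=Sing | 10 | OBJ |
9 | is | is | CONJ | _ | 8 | CONJ |
10 | végrehajtottak | végrehajt | VERB | Definite=Ind|Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin|Voice=Act | 4 | ATT |
11 | rajta | rajta | PRON | Case=Sup|Number=Sing|Person=3|PronType=Prs | 10 | OBL |
12 | . | . | PUNCT | _ | 0 | PUNCT |
"""


@dataclass
class Token:
    index: int    # position of the token in the sentence, starting at 1
    form: str     # word form as it appears in the text
    lemma: str
    pos: str      # part-of-speech tag
    feats: dict   # morphological features, e.g. {"Case": "Acc"}
    head: int     # index of the parent token; 0 means no parent
    deprel: str   # syntactic tag of the relation to the head


def parse_feats(text):
    """Turn 'Case=Acc|Number=Sing' into a dict; '_' means no features."""
    if text == "_":
        return {}
    return dict(item.split("=", 1) for item in text.split("|"))


def parse_row(line):
    """Parse one displayed row into a Token."""
    body = line.strip()
    if body.endswith("|"):
        # Drop the trailing column delimiter shown at the end of each row.
        body = body[:-1].rstrip()
    fields = body.split(" | ")
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
    index, form, lemma, pos, feats, head, deprel = fields
    return Token(int(index), form, lemma, pos, parse_feats(feats), int(head), deprel)


def parse_rows(text):
    """Parse all non-empty rows of one sentence."""
    return [parse_row(line) for line in text.splitlines() if line.strip()]


def dependents(tokens, head_index):
    """Return the indices of the tokens whose head is head_index."""
    return [t.index for t in tokens if t.head == head_index]
```

With these helpers, `dependents(parse_rows(EXAMPLE_ROWS), 4)` lists the tokens attached directly to the main verb szállították, and filtering on `head == 0` returns the tokens that have no parent in the example.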
For developers
Source | http://rgai.inf.u-szeged.hu/magyarlanc |
Programming language | Java |
Input | The output of the POS tagger: one token per row, with separate columns for the word form, lemma and morphological analysis. Sentences are separated by an empty line. |
Output | One token per row, with separate columns for the word form, lemma, morphological analysis, parent node and syntactic tag. |
Execution | java -Xmx2G -jar magyarlanc-3.0.jar -mode depparse -input in.txt -output out.txt |
Licence | The database is licensed under the Creative Commons Attribution-ShareAlike 4.0 (CC-BY-SA) licence. The source code is covered by the GNU General Public License (GPL v3). |
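The execution line and the file layout in the table above can be wrapped in a short script. This is a sketch under stated assumptions: `build_command`, `run_parser` and `split_sentences` are hypothetical helper names, the flags are copied verbatim from the Execution row, and running the parser requires Java and `magyarlanc-3.0.jar` in the working directory. The column delimiter of the files is not stated on this page, so the sketch only groups rows into sentences at empty lines and leaves each row unsplit.

```python
import subprocess


def build_command(input_path, output_path, jar="magyarlanc-3.0.jar", heap="2G"):
    """Build the argument list for the command shown in the Execution row."""
    return [
        "java", f"-Xmx{heap}", "-jar", jar,
        "-mode", "depparse",
        "-input", input_path,
        "-output", output_path,
    ]


def run_parser(input_path, output_path):
    """Run the parser; raises CalledProcessError if Java exits with an error."""
    subprocess.run(build_command(input_path, output_path), check=True)


def split_sentences(text):
    """Group rows into sentences; an empty line ends the current sentence."""
    sentences, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            sentences.append(current)
            current = []
    if current:
        # The last sentence may not be followed by an empty line.
        sentences.append(current)
    return sentences
```

A typical use would be to call `run_parser("in.txt", "out.txt")`, read `out.txt` as UTF-8 text, and pass its contents to `split_sentences` to obtain one list of rows per sentence.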