Maximal Parse Accuracy?
Title: Maximal Parse Accuracy?

Authors: Rens Bod

Abstract: This research aims to find the minimal set of fragments that achieves maximal parse accuracy in Data Oriented Parsing. The study uses the Penn Wall Street Journal (WSJ) treebank and investigates several strategies for constraining the set of subtrees. The results show that an upper bound on the number of words in the subtree frontiers and an upper bound on the depth of unlexicalized subtrees do not decrease parse accuracy. The study also found that counts of subtrees with several nonheadwords are important, leading to improved parse accuracy over previous parsers tested on the WSJ.

Main Research Question: What is the minimal set of fragments that achieves maximal parse accuracy in Data Oriented Parsing?

Methodology: The study uses the Penn Wall Street Journal treebank, a large collection of parsed sentences. It takes as its basis the Data Oriented Parsing (DOP) model, which uses a very large and extremely redundant set of subtrees, and investigates several strategies for constraining this set.

Results: Placing an upper bound on the number of words in the subtree frontiers, and an upper bound on the depth of unlexicalized subtrees, does not decrease parse accuracy. Counts of subtrees with several nonheadwords, by contrast, are found to be important: keeping them results in improved parse accuracy.

Implications: This research suggests that it is possible to impose constraints on the subtrees used in the DOP model without reducing parse accuracy. It also highlights the importance of counts of subtrees with several nonheadwords in achieving maximal parse accuracy.

Conclusion: The subtree set used by DOP can be bounded both in frontier size and in the depth of unlexicalized subtrees at no cost in parse accuracy, while counts of subtrees with several nonheadwords should be retained.

Link to Article: https://arxiv.org/abs/0110050v1

arXiv ID: 0110050v1

[[Category:Computer Science]] [[Category:Parse]] [[Category:Accuracy]] [[Category:Subtrees]] [[Category:Several]] [[Category:Upper]]
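The two constraints on the subtree set described above can be illustrated with a small sketch. This is not the paper's implementation: the Node class, the function names, and the threshold parameters max_words and max_unlex_depth are hypothetical, and the thresholds used in the usage lines are arbitrary illustrative values. The sketch assumes a fragment's frontier consists of word leaves and nonterminal leaves (open substitution sites), and treats a fragment as unlexicalized when its frontier contains no words.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node of a tree fragment (hypothetical representation)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    is_word: bool = False  # True only for a lexical (word) leaf


def frontier_words(t: Node) -> int:
    """Count the word leaves on the fragment's frontier."""
    if not t.children:
        return 1 if t.is_word else 0
    return sum(frontier_words(c) for c in t.children)


def depth(t: Node) -> int:
    """Length of the longest path from the root to a leaf."""
    if not t.children:
        return 0
    return 1 + max(depth(c) for c in t.children)


def keep_fragment(t: Node, max_words: int, max_unlex_depth: int) -> bool:
    """Apply both bounds: frontier size, and depth if unlexicalized."""
    n = frontier_words(t)
    if n > max_words:
        return False
    # The depth bound applies only to fragments without any words.
    if n == 0 and depth(t) > max_unlex_depth:
        return False
    return True


# A lexicalized fragment: S -> NP (VP -> (V -> saw) NP)
lex = Node("S", [Node("NP"),
                 Node("VP", [Node("V", [Node("saw", is_word=True)]),
                             Node("NP")])])
# An unlexicalized fragment: S -> NP (VP -> V NP)
unlex = Node("S", [Node("NP"), Node("VP", [Node("V"), Node("NP")])])

print(keep_fragment(lex, max_words=1, max_unlex_depth=1))
print(keep_fragment(unlex, max_words=1, max_unlex_depth=1))
```

With these illustrative thresholds the lexicalized fragment is kept, because it has one frontier word and the depth bound does not apply to it, while the unlexicalized fragment of depth 2 is discarded.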