Information Extraction Using the Structured Language Model
Title: Information Extraction Using the Structured Language Model

Abstract: This research paper presents a data-driven approach to information extraction, viewed as template filling, using the Structured Language Model (SLM) as a statistical parser. The task of template filling is cast as constrained parsing using the SLM. The model is automatically trained from a set of sentences annotated with frame/slot labels and spans. Training proceeds in stages: first, a constrained syntactic parser is trained such that the parses on training data meet the specified semantic spans; then the non-terminal labels are enriched to contain semantic information; and finally, a constrained syntactic+semantic parser is trained on the parse trees resulting from the previous stage. Despite the small amount of training data used, the model outperforms the slot-level accuracy of a simple semantic grammar manually authored for the MiPad (personal information management) task.

Main Research Question: Can the Structured Language Model (SLM) be trained automatically from annotated data to perform information extraction, without requiring manual grammar authoring expertise?

Methodology: The paper proposes a data-driven approach to information extraction using the SLM. The model is trained on sentences annotated with frame/slot labels and spans, and training is divided into three stages. First, a constrained syntactic parser is trained so that its parses of the training data meet the specified semantic spans. Second, the non-terminal labels of those parses are enriched with semantic information. Third, a constrained syntactic+semantic parser is trained on the parse trees resulting from the previous stage.

Results: The SLM-based approach outperforms the slot-level accuracy of a simple semantic grammar manually authored for the MiPad task, despite the small amount of training data used. This suggests that the SLM can be used effectively for information extraction without requiring manual grammar authoring expertise.

Implications: The research has several implications for the field of information extraction. First, it demonstrates that an SLM-based extraction model can be trained automatically, reducing the need for manual grammar authoring. Second, it provides an approach to information extraction whose performance depends on annotated training data rather than on hand-written rules. Finally, it suggests a route toward information extraction systems that are less costly to build and, on the MiPad task, more accurate at the slot level than a simple hand-authored grammar.

Link to Article: https://arxiv.org/abs/0108023v1

Authors:

arXiv ID: 0108023v1

[[Category:Computer Science]] [[Category:Information]] [[Category:Extraction]] [[Category:Semantic]] [[Category:Model]] [[Category:Slm]]
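
The first two training stages described under Methodology can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Span` type, the function names, the example sentence, and the slot labels `attendee` and `time` are all hypothetical. The sketch also assumes that a parse "meets" the semantic spans when every annotated span coincides exactly with some constituent of the parse, and that enrichment simply appends the semantic label to the syntactic one; the summary above does not specify either detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    label: str  # syntactic non-terminal or semantic frame/slot label
    start: int  # index of the first word covered (inclusive)
    end: int    # index of the last word covered (inclusive)


def meets_semantic_spans(constituents, semantic_spans):
    """Assumed constraint: every annotated span matches a constituent exactly."""
    boundaries = {(c.start, c.end) for c in constituents}
    return all((s.start, s.end) in boundaries for s in semantic_spans)


def enrich_labels(constituents, semantic_spans):
    """Assumed enrichment: append the semantic label to matching constituents."""
    by_boundary = {(s.start, s.end): s.label for s in semantic_spans}
    enriched = []
    for c in constituents:
        key = (c.start, c.end)
        if key in by_boundary:
            enriched.append(Span(f"{c.label}-{by_boundary[key]}", c.start, c.end))
        else:
            enriched.append(c)
    return enriched


# Hypothetical sentence: "schedule meeting with john smith at nine"
# word indices:            0        1       2    3    4     5  6
semantic = [Span("attendee", 3, 4), Span("time", 6, 6)]

# A candidate parse whose constituents respect both annotated spans.
constituents = [
    Span("S", 0, 6),
    Span("PP", 2, 4),
    Span("NP", 3, 4),
    Span("PP", 5, 6),
    Span("NP", 6, 6),
]

# A candidate parse that brackets "smith at" together and so misses "john smith".
bad_parse = [Span("S", 0, 6), Span("NP", 4, 5), Span("NP", 6, 6)]

print(meets_semantic_spans(constituents, semantic))
print(meets_semantic_spans(bad_parse, semantic))
print([c.label for c in enrich_labels(constituents, semantic)])
```

In this sketch the first stage would keep only parses for which `meets_semantic_spans` holds, and the second stage would relabel them with `enrich_labels` before the syntactic+semantic parser is trained on the result.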