Measuring Effective Similarity: A Universal Metric Approach
Abstract: How can similarity be measured effectively between sequences such as internet documents, text corpora in different languages, computer programs, or chain letters? The study proposes a new distance, the "normalized information distance", based on the noncomputable notion of Kolmogorov complexity. The distance is shown to be universal, meaning that it can discern all effective similarities. It also satisfies the properties of a metric and takes values in the range [0, 1], hence the name "similarity metric". The paper presents two applications in widely divergent areas: comparing whole mitochondrial genomes to infer evolutionary history, and constructing a language tree for 52 languages from translated versions of the "Universal Declaration of Human Rights".

Main Research Question: How can we measure similarity effectively between sequences?

Methodology: The study defines the normalized information distance in terms of Kolmogorov complexity, which is noncomputable, and shows that the resulting distance is universal: it can discern all effective similarities.

Results: The normalized information distance is shown to be a metric with values in the range [0, 1]. The paper demonstrates it in two widely divergent areas: comparing whole mitochondrial genomes, and constructing a language tree for 52 languages.

Implications: The research provides a tool for measuring similarity and comparing sequences. Because the metric is universal, it is not tied to one kind of data and can be applied across fields, as the genome and language applications illustrate.
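Example: Kolmogorov complexity K cannot be computed, so any hands-on use of the normalized information distance, max{K(x|y), K(y|x)} / max{K(x), K(y)}, has to replace K with something computable. A common stand-in is the output length of a real compressor, which gives the normalized compression distance (NCD). The sketch below is an illustration under that assumption, not the procedure used in the paper: the choice of zlib, the function names, and the sample inputs are all hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data; a rough, computable stand-in for K(data)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).

    This approximates the normalized information distance. With a real
    compressor the value can fall slightly outside [0, 1].
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sample inputs: two near-identical texts and one block of noise.
text_a = b"the quick brown fox jumps over the lazy dog " * 20
text_b = text_a.replace(b"lazy dog", b"lazy cat", 1)
rng = random.Random(0)  # fixed seed so the noise is reproducible
noise = bytes(rng.getrandbits(8) for _ in range(900))

print("near-identical texts:", ncd(text_a, text_b))
print("text versus noise:   ", ncd(text_a, noise))
```

Similar inputs compress well together, so their distance is small, while unrelated inputs give a distance close to 1. The quality of the approximation depends on the compressor, so a stronger or domain-specific compressor may be a better choice for real data.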
Link to Article: https://arxiv.org/abs/0111054v2
Authors:
arXiv ID: 0111054v2

[[Category:Computer Science]] [[Category:Metric]] [[Category:Similarity]] [[Category:Universal]] [[Category:Be]] [[Category:It]]