Improving Table Compression with Combinatorial Optimization
Title: Improving Table Compression with Combinatorial Optimization

Abstract: This research studies how to improve the compression of massive tables within the partition-training paradigm. The authors provide a new theory that unifies previous experimental observations on partitioning with heuristic column permutations, both of which are used to improve compression rates. They develop the first on-line training algorithms for table compression, which can be applied to individual files and not just to continuously operating sources. They also develop an off-line training algorithm, based on a link to the asymmetric traveling salesman problem, which improves on prior work by rearranging columns before partitioning. Experimental results support these conclusions. The authors also show that a variation of the table compression problem is MAX-SNP hard.

Main Research Question: How can we improve the compression rates of massive tables using a partition-training paradigm?

Methodology: The authors study the problem of compressing massive tables within the partition-training paradigm introduced by Buchsbaum et al. In this paradigm, the columns of a table are split into groups, and each group is compressed separately. The authors build a theory that explains earlier experimental observations and column-permutation heuristics, and from it they derive on-line training algorithms and an off-line training algorithm that reorders columns before partitioning.

Results: The on-line algorithms provide 35-55% improvement over gzip with negligible slowdown. The off-line reordering provides up to 20% further improvement over partitioning alone. The authors also show that a variation of the table compression problem is MAX-SNP hard.

Implications: The research has implications for the compression of massive tables, both individual files and tables generated by continuously operating sources. The new algorithms and theory provide a more effective way to compress such tables, reducing storage and network bandwidth requirements.
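To make the partitioning idea concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes a table stored as a list of rows of byte-string fields, uses `zlib` as a stand-in compressor, and searches only contiguous column intervals with a simple dynamic program. The function names `interval_cost` and `best_contiguous_partition` are made up for this illustration.

```python
import zlib


def interval_cost(rows, start, end):
    """Compressed size in bytes of columns start..end-1, serialized row by row."""
    data = b"".join(b"".join(row[start:end]) for row in rows)
    return len(zlib.compress(data, 9))


def best_contiguous_partition(rows):
    """Pick the split of columns into contiguous intervals that minimizes
    the total compressed size. Returns (total_size, list_of_intervals)."""
    n = len(rows[0])
    best = [0] + [float("inf")] * n   # best[i]: cheapest cost for columns 0..i-1
    cut = [0] * (n + 1)               # cut[i]: start of the last interval in that solution
    for end in range(1, n + 1):
        for start in range(end):
            cost = best[start] + interval_cost(rows, start, end)
            if cost < best[end]:
                best[end] = cost
                cut[end] = start
    # Walk the recorded cuts backwards to recover the intervals.
    intervals = []
    end = n
    while end > 0:
        intervals.append((cut[end], end))
        end = cut[end]
    return best[n], intervals[::-1]


# Hypothetical three-column table: a counter, a constant, and a varying byte pattern.
rows = [[b"%04d" % i, b"AAAA", bytes([i * 37 % 256]) * 4] for i in range(200)]
total, parts = best_contiguous_partition(rows)
print(total, parts)
```

In a training setting, one would presumably run the search on a small sample of rows and reuse the resulting intervals for the full table. Reordering the columns before this step, as the off-line algorithm does, changes which columns can end up in the same interval; that step is not shown here.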
Link to Article: https://arxiv.org/abs/0203018v1

Authors:

arXiv ID: 0203018v1

[[Category:Computer Science]] [[Category:Compression]] [[Category:Training]] [[Category:Line]] [[Category:Table]] [[Category:Tables]]