Please use this identifier to cite or link to this item: https://idr.nitk.ac.in/jspui/handle/123456789/6981
Full metadata record
DC Field | Value | Language
dc.contributor.author | Trotman, A. | -
dc.contributor.author | Subramanya, V. | -
dc.date.accessioned | 2020-03-30T09:46:33Z | -
dc.date.available | 2020-03-30T09:46:33Z | -
dc.date.issued | 2007 | -
dc.identifier.citation | International Conference on Information and Knowledge Management, Proceedings, 2007, Vol., , pp.983-986 | en_US
dc.identifier.uri | http://idr.nitk.ac.in/jspui/handle/123456789/6981 | -
dc.description.abstract | Compression of term frequency lists and very long document-id lists within an inverted file search engine are examined. Several compression schemes are compared including Elias γ and δ codes, Golomb Encoding, Variable Byte Encoding, and a class of word-based encoding schemes including Simple-9, Relative-10 and Carryover-12. It is shown that these compression methods are not well suited to compressing these kinds of lists of numbers. Of those tested, Carryover-12 is preferred because it is both effective at compression and fast at decompression. A novel technique, Sigma Encoding prior to compression, is proposed and tested. Sigma Encoding utilizes a parameterized dictionary to reduce the number of bits necessary to store an integer. This method shows an about 0.3 bit per integer improvement over Carryover-12 while costing only about 3 extra clock cycles per integer to decompress. Copyright 2007 ACM. | en_US
dc.title | Sigma encoded inverted files | en_US
dc.type | Book chapter | en_US
Appears in Collections: 2. Conference Papers
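
As an illustration of one of the baseline schemes named in the abstract, Variable Byte Encoding, the following is a minimal sketch rather than the implementation evaluated in the paper. It assumes one common byte layout (7 payload bits per byte, low-order group first, high bit set on the last byte of each integer); the function names `vbyte_encode` and `vbyte_decode` are hypothetical, and the paper may use a different convention.

```python
def vbyte_encode(numbers):
    """Encode non-negative integers with a variable-byte scheme.

    Assumed layout (one common convention, not taken from the paper):
    7 payload bits per byte, low-order group first, and the high bit
    set only on the final byte of each integer.
    """
    out = bytearray()
    for n in numbers:
        if n < 0:
            raise ValueError("variable-byte encoding needs non-negative integers")
        while n >= 128:
            out.append(n & 0x7F)   # continuation byte: high bit clear
            n >>= 7
        out.append(n | 0x80)       # terminating byte: high bit set
    return bytes(out)


def vbyte_decode(data):
    """Decode a byte string produced by vbyte_encode back into integers."""
    numbers = []
    value = 0
    shift = 0
    for byte in data:
        if byte & 0x80:
            # Final byte of this integer: add its 7 payload bits and emit.
            numbers.append(value | ((byte & 0x7F) << shift))
            value, shift = 0, 0
        else:
            value |= byte << shift
            shift += 7
    return numbers
```

Under this layout, small integers occupy a single byte and larger ones grow in whole-byte steps, which is why byte-aligned schemes of this kind are typically simple to decode but less compact than bit-level or word-packed codes.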

Files in This Item:
File | Description | Size | Format
6981.pdf | | 217.68 kB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.