# seek-bzip

[![Build Status][1]][2] [![dependency status][3]][4] [![dev dependency status][5]][6]

`seek-bzip` is a pure-JavaScript Node.js module adapted from [node-bzip](https://github.com/skeggse/node-bzip) and before that [antimatter15's pure-JavaScript bzip2 decoder](https://github.com/antimatter15/bzip2.js).  Like these projects, `seek-bzip` only does decompression (see [compressjs](https://github.com/cscott/compressjs) if you need compression code).  Unlike those other projects, `seek-bzip` can seek to and decode single blocks from the bzip2 file.

`seek-bzip` primarily decodes buffers into other buffers, synchronously.
With the help of the [fibers](https://github.com/laverdet/node-fibers)
package, it can operate on node streams; see `test/stream.js` for an
example.

## How to Install

```
npm install seek-bzip
```

This package uses
[Typed Arrays](https://developer.mozilla.org/en-US/docs/JavaScript/Typed_arrays), which are present in node.js >= 0.5.5.

## Usage

After compressing some example data into `example.bz2`, the following will recreate that original data and save it to `example`:

```
var Bunzip = require('seek-bzip');
var fs = require('fs');

// Read the whole compressed file into a Buffer and decode it synchronously.
var compressedData = fs.readFileSync('example.bz2');
var data = Bunzip.decode(compressedData);

// `data` is a Buffer holding the decompressed bytes.
fs.writeFileSync('example', data);
```

See the tests in the `test/` directory for further usage examples.

To decompress single blocks of bzip2-compressed data, you will need
an out-of-band index listing the start of each bzip2 block.  (Presumably
you generate this at the same time as you index the start of the information
you wish to seek to inside the compressed file.)  The `seek-bzip` module
has been designed to be compatible with the C implementation `seek-bzip2`
available from https://bitbucket.org/james_taylor/seek-bzip2.  That codebase
contains a `bzip-table` tool which will generate bzip2 block start indices.
There is also a pure-JavaScript `seek-bzip-table` tool in this package's
`bin` directory.

## Documentation

`require('seek-bzip')` returns a `Bunzip` object.  It contains three static
methods.  The first is a function accepting one to three parameters:

`Bunzip.decode = function(input, [Number expectedSize] or [output], [boolean multistream])`

The `input` argument can be a "stream" object (which must implement the
`readByte` method), or a `Buffer`.

If `expectedSize` is not present, `Bunzip.decode` simply decodes `input` and
returns the resulting `Buffer`.

If `expectedSize` is present (and numeric), `Bunzip.decode` will store
the results in a `Buffer` of length `expectedSize`, and throw an error
if the size of the decoded data does not match `expectedSize`.

If you pass a non-numeric second parameter, it can either be a `Buffer`
object (which must be of the correct length; an error will be thrown if
the size of the decoded data does not match the buffer length) or
a "stream" object (which must implement a `writeByte` method).

The optional third `multistream` parameter, if true, attempts to continue
reading past the end of the bzip2 file.  This supports "multistream"
bzip2 files, which are simply multiple bzip2 files concatenated together.
If this argument is true, the input stream must have an `eof` method
which returns true when the end of the input has been reached.

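As a rough illustration of the "stream" objects described above, the following sketch wraps a `Buffer` in an object with `readByte` and `eof` methods, and collects output through `writeByte`. The names `BufferInput` and `ByteSink` are hypothetical and not part of this package. The sketch also assumes that `readByte` signals end of input by returning `-1`, which this README does not state.

```javascript
'use strict';

// Hypothetical input adapter: exposes a Buffer through `readByte` and `eof`.
// ASSUMPTION (not stated in this README): readByte() returns -1 once the
// input is exhausted.
function BufferInput(buffer) {
  this.buffer = buffer;
  this.pos = 0;
}
BufferInput.prototype.readByte = function () {
  return this.pos < this.buffer.length ? this.buffer[this.pos++] : -1;
};
BufferInput.prototype.eof = function () {
  return this.pos >= this.buffer.length;
};

// Hypothetical output adapter: collects bytes one at a time via `writeByte`.
function ByteSink() {
  this.bytes = [];
}
ByteSink.prototype.writeByte = function (byte) {
  this.bytes.push(byte);
};
ByteSink.prototype.toBuffer = function () {
  return Buffer.from(this.bytes);
};

// Exercise the adapters on their own by copying input to output.
var input = new BufferInput(Buffer.from('abc'));
var sink = new ByteSink();
while (!input.eof()) {
  sink.writeByte(input.readByte());
}
console.log(sink.toBuffer().toString()); // prints "abc"
```

With the package installed, objects of this shape could presumably be passed as `Bunzip.decode(input, sink, true)`, following the signature given above.
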
The second exported method is a function accepting two or three parameters:

`Bunzip.decodeBlock = function(input, Number blockStartBits, [Number expectedSize] or [output])`

The `input` and `expectedSize`/`output` parameters are as above.
The `blockStartBits` parameter gives the start of the desired block, in bits.

If passing a stream as the `input` parameter, it must implement the
`seek` method.

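Block starts are given in bits because bzip2 blocks are not aligned to byte boundaries. The arithmetic relating a bit offset to a byte position can be sketched as follows; `splitBitOffset` is a hypothetical helper for illustration only, and 320555 is just an example offset.

```javascript
'use strict';

// Split a bit offset into a whole-byte offset plus the leftover bits.
function splitBitOffset(blockStartBits) {
  return {
    byteOffset: Math.floor(blockStartBits / 8),
    bitOffset: blockStartBits % 8
  };
}

// Bit 32 sits exactly on a byte boundary, right after the 4-byte
// bzip2 stream header.
console.log(splitBitOffset(32));     // { byteOffset: 4, bitOffset: 0 }

// Most block starts fall in the middle of a byte.
console.log(splitBitOffset(320555)); // { byteOffset: 40069, bitOffset: 3 }
```
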
The final exported method is a function accepting two or three parameters:

`Bunzip.table = function(input, Function callback, [boolean multistream])`

The `input` and `multistream` parameters are identical to those for the
`decode` method.

This function will invoke `callback(position, size)` once per bzip2 block,
where `position` gives the starting position of the block (in *bits*), and
`size` gives the uncompressed size of the block (in bytes).

This can be used to construct an index allowing direct access to a particular
block inside a bzip2 file, using the `decodeBlock` method.

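One possible way to build such an index is sketched below. `createBlockIndex` is a hypothetical helper, not part of this package. Its `add` method has the `callback(position, size)` shape described above, and `find` maps an offset in the uncompressed data to the block that contains it. The block values fed in here are made up so that the example runs without a bzip2 file.

```javascript
'use strict';

// Hypothetical helper: collects (position, size) pairs and records where each
// block starts in the uncompressed data.
function createBlockIndex() {
  var entries = [];
  var total = 0;
  return {
    entries: entries,
    // Same shape as the callback(position, size) described above.
    add: function (position, size) {
      entries.push({ position: position, size: size, start: total });
      total += size;
    },
    // Binary search for the block holding uncompressed byte `offset`.
    find: function (offset) {
      if (offset < 0 || offset >= total) { return null; }
      var lo = 0, hi = entries.length - 1;
      while (lo < hi) {
        var mid = (lo + hi + 1) >> 1;
        if (entries[mid].start <= offset) { lo = mid; } else { hi = mid - 1; }
      }
      return { block: entries[lo], skip: offset - entries[lo].start };
    }
  };
}

// Made-up block data; in real use `index.add` would be the callback.
var index = createBlockIndex();
index.add(32, 1000);
index.add(5000, 1000);
index.add(9000, 500);

var hit = index.find(1500);
console.log(hit.block.position, hit.skip); // prints "5000 500"
```

The `position` of the block found this way is the value to pass as `blockStartBits` to `decodeBlock`, and `skip` is the number of bytes to discard from the start of the decoded block.
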
## Command-line

There are binaries available in `bin`.  The first generates an index of all
the blocks in a bzip2-compressed file:
```
$ bin/seek-bzip-table test/sample4.bz2
32	99981
320555	99981
606348	99981
847568	99981
1089094	99981
1343625	99981
1596228	99981
1843336	99981
2090919	99981
2342106	39019
$
```
The first field is the starting position of the block, in bits, and the
second field is the uncompressed size of the block, in bytes.

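If the index is saved to a file, it has to be parsed again before use. A minimal sketch, assuming the output is always two whitespace-separated numeric columns as in the listing above (`parseBlockTable` is a hypothetical name):

```javascript
'use strict';

// Parse "<start in bits> <size in bytes>" lines into objects.
// ASSUMPTION: each non-empty line holds two whitespace-separated numbers.
function parseBlockTable(text) {
  return text.split('\n')
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0; })
    .map(function (line) {
      var fields = line.split(/\s+/);
      return { position: Number(fields[0]), size: Number(fields[1]) };
    });
}

// Three lines taken from the listing above.
var sample = '32\t99981\n320555\t99981\n2342106\t39019\n';
var blocks = parseBlockTable(sample);
console.log(blocks.length);      // prints 3
console.log(blocks[2].position); // prints 2342106
```
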
The second binary decodes an arbitrary block of a bzip2 file:
```
$ bin/seek-bunzip -d -b 2342106 test/sample4.bz2 | tail
élan's
émigré
émigré's
émigrés
épée
épée's
épées
étude
étude's
études
$
```

Use `--help` to see other options.

## Help wanted

Improvements to this module would be generally useful.
Feel free to fork on GitHub and submit pull requests!

## Related projects

* https://github.com/skeggse/node-bzip node-bzip (original upstream source)
* https://github.com/cscott/compressjs
  Lots of compression/decompression algorithms from the same author as this
  module, including bzip2 compression code.
* https://github.com/cscott/lzjb fast LZJB compression/decompression

## License

#### MIT License

> Copyright © 2013-2015 C. Scott Ananian
>
> Copyright © 2012-2015 Eli Skeggs
>
> Copyright © 2011 Kevin Kwok
>
> Permission is hereby granted, free of charge, to any person obtaining
> a copy of this software and associated documentation files (the
> "Software"), to deal in the Software without restriction, including
> without limitation the rights to use, copy, modify, merge, publish,
> distribute, sublicense, and/or sell copies of the Software, and to
> permit persons to whom the Software is furnished to do so, subject to
> the following conditions:
>
> The above copyright notice and this permission notice shall be
> included in all copies or substantial portions of the Software.
>
> THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
> LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
> OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
> WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

[1]: https://travis-ci.org/cscott/seek-bzip.png
[2]: https://travis-ci.org/cscott/seek-bzip
[3]: https://david-dm.org/cscott/seek-bzip.png
[4]: https://david-dm.org/cscott/seek-bzip
[5]: https://david-dm.org/cscott/seek-bzip/dev-status.png
[6]: https://david-dm.org/cscott/seek-bzip#info=devDependencies