splittar

What is it?
splittar is a small utility written in Python. It creates multiple tar files from a set of data, with each tar file limited in size. This is useful for archiving data onto removable media such as CDs or DVDs. Other solutions that I’ve found rely on splitting one giant tar file, which renders every piece except the first useless on its own. With splittar, each file that is created is a valid tar file that is useful on its own.

Where do I get it?

Download it here:

You’ll also need Python version 2.4 installed.

How do I install it?

If you downloaded the .tar.gz version, running python setup.py install should do the trick, although this won’t install the man pages.

If you downloaded the .deb version, then running dpkg -i splittar_0.2_all.deb should work.

How do I use it?

An example is probably the best way to get started.

splittar -f outputfile.tar.gz -m CD $HOME

This will create files called outputfile-1.tar.gz, outputfile-2.tar.gz, etc., each at most 700 MB in size, from the files in $HOME.
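Since each output file is meant to be a valid tar file on its own, one way to check a run is to open every piece independently and list its members. The following is a minimal sketch in modern Python 3 (not the Python 2.4 that splittar itself targets); the helper name list_archives and the glob pattern are illustrative assumptions, not part of splittar.

```python
import glob
import tarfile


def list_archives(pattern):
    """Return {archive path: [member names]} for every archive matching pattern.

    Hypothetical helper: each archive is opened on its own, without
    needing any of the other pieces.
    """
    contents = {}
    for path in sorted(glob.glob(pattern)):
        # "r:*" lets tarfile detect gzip, bzip2 or plain tar transparently.
        with tarfile.open(path, "r:*") as archive:
            contents[path] = archive.getnames()
    return contents
```

For the example above, something like list_archives("outputfile-*.tar.gz") would be the call to try.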

See the man page for more information.

3 comments to splittar

  • Chris

    Thanks for writing splittar, but it would be more helpful if you’d include an online copy of the man page (at least the full syntax). The download package only comes with a man page in .sgml format, which apparently needs to be converted to regular man page format using a tool called docbook-to-man, which doesn’t come standard on my system and which would require me to install about 6 other unwanted tools in the process. No thanks. I’ll be trying out a competing script (tarlimit) first because at least I can figure out how to use it without an hour of legwork.

  • Chris

    Well, I went to the trouble of stripping the SGML tags from the man page and cleaning it up a bit. Hope this is helpful to someone else.

    NAME

    splittar - create multiple tar files

    SYNOPSIS

    splittar

    -h|--help
    --version
    -f|--output outputfile
    -m|--maxsize maxsize
    -n|--numopen maxopen
    -r|--ratioweight weight
    -x|--dontapprox
    -z|--gzip
    -j|--bzip2
    -p|--plain
    -d|--debug
    -v|--verbose
    -q|--quiet
    --profile
    file

    DESCRIPTION

    splittar allows you to create one or more
    tar files from a set of data where each of the generated tar files is
    less than a specified maximum size.

    Each tar file is a proper, self-contained tar file. Other methods of
    backing up data to removable media require a tar file to be split,
    making tar file n useless without files 1,…,n-1.

    OPTIONS

    -h|--help

    Outputs a brief usage message and exits.

    --version

    Outputs splittar’s version and exits.

    -f|--output outputfile

    You MUST specify either -f or
    --output to specify the name of the output tar
    file(s). Output to standard output is NOT supported since there
    is no simple way to indicate where breaks between files would be.

    The files created by splittar will be
    named according to this option, with a suffix appended to the
    portion before the filename extension indicating each file’s
    position in the sequence. See the EXAMPLES section below.
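The naming rule can be sketched in a few lines of Python. This is only an illustration inferred from the two examples on this page (outputfile.tar.gz becomes outputfile-1.tar.gz, example.tgz becomes example-1.tgz); the function name numbered_name is hypothetical, and how splittar itself detects the extension is an assumption.

```python
import os


def numbered_name(output, index):
    """Insert "-<index>" before the filename extension (hypothetical helper)."""
    base, ext = os.path.splitext(output)
    # Assumption: compound extensions such as ".tar.gz" or ".tar.bz2"
    # count as a single extension, as the outputfile.tar.gz example suggests.
    if base.endswith(".tar"):
        base, ext = base[:-len(".tar")], ".tar" + ext
    return "%s-%d%s" % (base, index, ext)


print(numbered_name("outputfile.tar.gz", 1))  # outputfile-1.tar.gz
print(numbered_name("example.tgz", 2))        # example-2.tgz
```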

    -m|--maxsize maxsize

    Specify how large each file generated by
    splittar will be. The size can be
    specified in bytes (by default). You can also specify a
    number followed by one of the following suffixes:

    KB = 1024 bytes
    MB = 1024 KB
    GB = 1024 MB
    TB = 1024 GB

    One of the following units may also be used:

    CD = 700 MB
    CD650 = 650 MB
    DVD = 4699979766 bytes
    DVD3 = 1566572544 bytes

    DVD3 can be useful for creating files which fit onto
    a DVD. The maximum file size for an ISO9660 filesystem is
    around 2GB, so the value for DVD3 was chosen to be less
    than 2GB and allow 3 of these files to fit onto a single
    layer DVD nicely.
    The default value is CD.
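To make the units concrete, here is a minimal Python sketch that converts a maxsize string into bytes using the values listed above. The helper name parse_maxsize is hypothetical, and the details it assumes (whole numbers only, case-sensitive suffixes, no whitespace) are guesses rather than documented splittar behaviour.

```python
import re

# Multipliers for the plain size suffixes listed above (binary units).
SUFFIXES = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}

# Named media sizes in bytes, taken from the list above.
MEDIA = {
    "CD": 700 * SUFFIXES["MB"],
    "CD650": 650 * SUFFIXES["MB"],
    "DVD": 4699979766,
    "DVD3": 1566572544,
}


def parse_maxsize(text="CD"):
    """Convert a maxsize string to a byte count (hypothetical helper)."""
    if text in MEDIA:
        return MEDIA[text]
    # Assumption: a whole number, optionally followed by one suffix.
    match = re.fullmatch(r"(\d+)(KB|MB|GB|TB)?", text)
    if match is None:
        raise ValueError("unrecognised size: %r" % text)
    number, suffix = match.groups()
    # A bare number (suffix is None) is a size in bytes.
    return int(number) * SUFFIXES.get(suffix, 1)
```

With these values, three DVD3 files total 4699717632 bytes, which is below the DVD value of 4699979766 bytes, and a single DVD3 file stays under 2 GB (2147483648 bytes).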

    -n|--numopen maxopen

    How many tar files splittar should
    keep open simultaneously.
    Keeping multiple tar files open simultaneously can make the
    resulting files more consistent in size. When adding a file to
    the tar files, splittar tries to
    determine if the file will fit in one of the output files. If
    a file is too big for one output file, the next one is tried
    until either the file will fit, or until there are no more
    output files to try (in which case a new file is opened). By
    keeping more files open, files can be created that are as close
    as possible to the maximum size.
    One side effect of this option is that files may appear
    out of order in the resulting tar files. If this is a concern,
    set this option to 1.
    Setting this option to 0 means that
    splittar will keep an unlimited number
    of tar files open. The overhead per tar file is not that
    large, so this is the recommended setting.
    The default value is 0 (unlimited).
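The placement strategy described for this option resembles first-fit packing. The sketch below is a simplified Python model of it, working on plain byte sizes and ignoring compression. The function name pack is hypothetical, and closing the oldest archive once the limit is exceeded is an assumption; the text above does not say which archive splittar closes.

```python
def pack(sizes, max_size, max_open=0):
    """Assign each file size to an archive index, first fit over open archives.

    Returns (placement, totals): the archive index chosen for each file and
    the number of bytes in each archive. max_open=0 means unlimited.
    """
    totals = []     # bytes stored in every archive created so far
    open_ids = []   # indices of archives that still accept files
    placement = []
    for size in sizes:
        for idx in open_ids:
            if totals[idx] + size <= max_size:
                break  # first open archive with enough room
        else:
            # Nothing fits: start a new archive. A file larger than
            # max_size still gets an archive of its own.
            totals.append(0)
            idx = len(totals) - 1
            open_ids.append(idx)
            if max_open and len(open_ids) > max_open:
                open_ids.pop(0)  # assumption: close the oldest archive
        totals[idx] += size
        placement.append(idx)
    return placement, totals


print(pack([6, 5, 4, 3], 10))              # ([0, 1, 0, 1], [10, 8])
print(pack([6, 5, 4, 3], 10, max_open=1))  # ([0, 1, 1, 2], [6, 9, 3])
```

In this toy run, the unlimited setting fills the archives more tightly, while max_open=1 keeps the files in their original order at the cost of a third archive.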

    -r|--ratioweight weight

    Sets the weight assigned to the estimated size of a file
    in the archive as calculated by multiplying the compression
    ratio to date by the file’s actual size.

    A file’s estimated size in the archive is used to determine if
    the file should be included in one of the open archives, or if
    a new archive should be started. The estimated size is a
    weighted average between the actual file size, and the file
    size multiplied by the current compression ratio. The ratio
    weight parameter controls how these values are combined. A
    value of 0.0 means that the compression ratio has no influence
    on the estimated size at all (the estimated file size equals
    the actual file size).
    The default value is 1.0 (assume the current file will
    compress exactly as well as all the previous files).
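The weighted average described above can be written out as a short Python sketch. The function name estimated_size is hypothetical, and it assumes the compression ratio is expressed as compressed size divided by original size (so 0.5 means the data has halved so far); the description does not state which way the ratio is defined.

```python
def estimated_size(actual_size, ratio, weight=1.0):
    """Blend the raw size with the ratio-scaled size (hypothetical helper).

    weight=0.0 ignores the compression ratio entirely;
    weight=1.0 trusts it completely.
    """
    compressed_guess = actual_size * ratio
    return weight * compressed_guess + (1.0 - weight) * actual_size


print(estimated_size(1000, 0.5, weight=0.0))  # 1000.0
print(estimated_size(1000, 0.5, weight=1.0))  # 500.0
print(estimated_size(1000, 0.5, weight=0.5))  # 750.0
```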

    -x|--dontapprox

    Set the ratio weight to 0.0.

    -d|--debug

    Print out debug output.

    -v|--verbose

    Print out verbose output (less than debug).

    -q|--quiet

    Print out hardly anything.

    --profile

    Generates profile data in a file called ‘splittar.prof’
    in the current directory.

    file …

    List of files and/or directories to archive.

    RETURN VALUE

    Returns 0 on success
    Returns 1 when some files could not be added because access was denied
    Returns 2 when an output file could not be created
    Returns 254 on an uncaught exception
    Returns 255 when the program was interrupted
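A script that wraps splittar could map these return values to messages. The following Python 3 sketch shows one way to do that; the names EXIT_MEANINGS, describe_exit and run_splittar are hypothetical, and it assumes splittar is installed and on the PATH.

```python
import subprocess

# Meanings of the return values listed above.
EXIT_MEANINGS = {
    0: "success",
    1: "some files could not be added because access was denied",
    2: "an output file could not be created",
    254: "uncaught exception",
    255: "program was interrupted",
}


def describe_exit(code):
    """Translate a splittar return value into a readable message."""
    return EXIT_MEANINGS.get(code, "unknown return value %d" % code)


def run_splittar(args):
    """Run splittar with the given argument list; return (code, message)."""
    result = subprocess.run(["splittar"] + list(args))
    return result.returncode, describe_exit(result.returncode)
```

A call such as run_splittar(["-f", "example.tgz", "-m", "CD", "/home"]) mirrors the example in the EXAMPLES section.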

    ERRORS

    Return codes, either exit status or errno settings.

    EXAMPLES

    splittar -f example.tgz -m CD /home

    will generate CD-sized files named example-1.tgz, example-2.tgz, etc.
    from the data contained in /home

    ENVIRONMENT

    Environment variables this program might care about.

    FILES

    All files used by the program. Typical usage is like this:

    /usr/man
    default man tree

    /usr/man/man*/*.*
    unformatted (nroff source) man pages

    NOTES

    splittar may generate files larger than the
    specified maximum in some cases. Currently it estimates how much a tar
    file will grow based on the current compression ratio. If the size of a
    candidate file multiplied by the current compression ratio added to the
    current size of the tar file would exceed the maximum size, then a new
    file is started. If the candidate file does not compress at least as
    well as previous files, then the resulting tar file may be too large.
    In addition, the resulting tar file may be too large by a few
    kilobytes due to buffering in underlying libraries.
    In practice these issues have not been a problem.
    Future versions of splittar will attempt
    to address these problems.

    CAVEATS

    At least one input file will be included in each tar file. Depending on
    the compression used, this could mean that the resulting tar file
    exceeds the specified maximum.

    Future versions of splittar may address this by
    giving the option to split files that are too large.

    DIAGNOSTICS

    splittar will output warnings if any of the
    generated files exceed the maximum specified size.

    BUGS

    Things that are broken or just don’t work quite right.

    RESTRICTIONS

    Bugs you don’t plan to fix. :-)

    AUTHOR

    Chris AtLee

    HISTORY

    Programs derived from other sources sometimes have this.

    SEE ALSO

    tar(1)

  • Sabuj Pattanayek

    http://unix.stackexchange.com/questions/18628/generating-sets-of-files-that-fit-on-a-given-media-size-for-tar-t/19449#19449

    I’ll make some modifications to your script when I get some time to just spit out the list of files for input to tar -T
