splittar
What is it?
splittar is a small utility written in Python. It creates multiple tar files from a set of data, with each tar file limited in size. This is useful for archiving data onto removable media such as CDs or DVDs. Other solutions that I’ve found rely on splitting one giant tar file, which renders every piece except the first useless on its own. With splittar, each file that is created is a valid tar file that is useful on its own.
Where do I get it?
Download it here:
You’ll also need Python version 2.4 installed.
How do I install it?
If you downloaded the .tar.gz version, running setup.py install should do the trick, although this won’t install man pages.
If you downloaded the .deb version, then running dpkg -i splittar_0.2_all.deb should work.
How do I use it?
An example is probably the best way to get started.
splittar -f outputfile.tar.gz -m CD $HOME
This will create files called outputfile-1.tar.gz, outputfile-2.tar.gz, etc., each at most 700 MB in size, from the files in $HOME.
See the man page for more information.
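Because every part is a complete archive, any one of them can be opened without the others. The following is a minimal sketch of how you might check that, using Python's standard tarfile module. It is not part of splittar; the helper name list_parts is made up for illustration, and the code assumes a current Python 3 rather than the Python 2.4 that splittar itself targets.

```python
import glob
import tarfile


def list_parts(pattern):
    """Map each archive matching `pattern` to the member names it contains."""
    contents = {}
    for path in sorted(glob.glob(pattern)):
        # Each part is opened entirely by itself; no other part is consulted.
        # "r:*" lets tarfile detect gzip/bzip2/plain automatically.
        with tarfile.open(path, "r:*") as archive:
            contents[path] = archive.getnames()
    return contents


if __name__ == "__main__":
    for path, names in list_parts("outputfile-*.tar.gz").items():
        print(path, len(names), "members")
```

If any part failed to open here, it would not be a self-contained tar file.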


Thanks for writing splittar, but it would be more helpful if you’d include an online copy of the man page (at least the full syntax). The download package only comes with a man page in .sgml format, which apparently needs to be converted to regular man page format using a tool called docbook-to-man, which doesn’t come standard on my system and which would require me to install about 6 other unwanted tools in the process. No thanks. I’ll be trying out a competing script (tarlimit) first because at least I can figure out how to use it without an hour of legwork.
Well, I went to the trouble of stripping the SGML tags from the man page and cleaning it up a bit. Hope this is helpful to someone else.
NAME
splittar - create multiple tar files
SYNOPSIS
splittar
-h|--help
--version
-f|--output outputfile
-m|--maxsize maxsize
-n|--numopen maxopen
-r|--ratioweight weight
-x|--dontapprox
-z|--gzip
-j|--bzip2
-p|--plain
-d|--debug
-v|--verbose
-q|--quiet
--profile
file ...
DESCRIPTION
splittar allows you to create one or more
tar files from a set of data where each of the generated tar files is
less than a specified maximum size.
Each tar file is a proper, self-contained tar file. Other methods of
backing up data to removable media require a tar file to be split,
making tar file n useless without files 1,…,n-1.
OPTIONS
-h|--help
Outputs a brief usage message and exits.
--version
Outputs splittar’s version and exits.
-f|--output outputfile
You MUST specify either -f or
--output to specify the name of the output tar
file(s). Output to standard output is NOT supported since there
is no simple way to indicate where breaks between files would be.
The files created by splittar will be
named according to this option, with a suffix appended to the
portion before the filename extension indicating each file’s
position in the sequence. See the EXAMPLES section below.
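The naming rule can be sketched in a few lines of Python. This is only an illustration and not splittar's actual code: the function name part_name is invented, and it assumes the "extension" starts at the first dot in the file name, which matches both documented examples (outputfile.tar.gz and example.tgz).

```python
import os


def part_name(output, index):
    """Insert '-<index>' before the extension of `output`."""
    directory, base = os.path.split(output)
    # Assumption: the extension begins at the first dot, so ".tar.gz"
    # is kept together rather than being split at ".gz".
    stem, dot, ext = base.partition(".")
    return os.path.join(directory, "%s-%d%s%s" % (stem, index, dot, ext))


if __name__ == "__main__":
    print(part_name("outputfile.tar.gz", 1))
    print(part_name("example.tgz", 2))
```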
-m|--maxsize maxsize
Specify the maximum size of each file generated by
splittar. A plain number is
interpreted as a size in bytes. You can also specify a
number followed by one of the following suffixes:
KB = 1024 bytes
MB = 1024 KB
GB = 1024 MB
TB = 1024 GB
One of the following units may also be used:
CD = 700 MB
CD650 = 650 MB
DVD = 4699979766 bytes
DVD3 = 1566572544 bytes
DVD3 can be useful for creating files which fit onto
a DVD. The maximum file size for an ISO9660 filesystem is
around 2GB, so the value for DVD3 was chosen to be less
than 2GB and allow 3 of these files to fit onto a single
layer DVD nicely.
The default value is CD.
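To make the size table concrete, here is a rough sketch of a parser for these values. It is not taken from splittar's source: the name parse_maxsize is hypothetical, and details such as case sensitivity and allowing a space between number and suffix are guesses.

```python
import re

KB = 1024

# Multipliers for "<number><suffix>" values.
UNITS = {"KB": KB, "MB": KB ** 2, "GB": KB ** 3, "TB": KB ** 4}

# Named media sizes, in bytes, as listed in the man page.
MEDIA = {
    "CD": 700 * KB ** 2,
    "CD650": 650 * KB ** 2,
    "DVD": 4699979766,
    "DVD3": 1566572544,
}


def parse_maxsize(text="CD"):
    """Return the size in bytes described by `text` (default: CD)."""
    text = text.strip()
    if text in MEDIA:
        return MEDIA[text]
    # Assumption: optional whitespace between the number and the suffix.
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB|TB)?", text)
    if match is None:
        raise ValueError("unrecognised size: %r" % text)
    number, suffix = match.groups()
    # No suffix means plain bytes.
    return int(number) * UNITS.get(suffix, 1)


if __name__ == "__main__":
    for value in ("CD", "100MB", "4096", "DVD3"):
        print(value, parse_maxsize(value))
```

The DVD3 figure checks out against the explanation above: three DVD3-sized files total 4699717632 bytes, which is just under the DVD value, and each one is below 2 GB.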
-n|--numopen maxopen
How many tar files splittar should
keep open simultaneously.
Keeping multiple tar files open simultaneously can make the
resulting files more consistent in size. When adding a file to
the tar files, splittar tries to
determine if the file will fit in one of the output files. If
a file is too big for one output file, the next one is tried
until either the file will fit, or until there are no more
output files to try (in which case a new file is opened). By
keeping more files open, files can be created that are as close
as possible to the maximum size.
One side effect of this option is that files may appear
out of order in the resulting tar files. If this is a concern,
set this option to 1.
Setting this option to 0 means that
splittar will keep an unlimited number
of tar files open. The overhead per tar file is not that
large, so this is the recommended setting.
The default value is 0 (unlimited).
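The placement strategy described for this option resembles a first-fit packing. The sketch below is a simplified model of it, not splittar's implementation: it works on uncompressed sizes only, the name assign is invented, and closing the oldest archive when the limit is reached is an assumption (the man page does not say which one is closed).

```python
def assign(sizes, maxsize, numopen=0):
    """Distribute file sizes over archives; returns a list of lists of sizes."""
    closed, opened = [], []
    for size in sizes:
        # Try each open archive in turn and use the first one with room.
        for archive in opened:
            if sum(archive) + size <= maxsize:
                archive.append(size)
                break
        else:
            # The file fits nowhere, so a new archive is started. A file
            # larger than maxsize still gets an archive of its own.
            opened.append([size])
            # numopen == 0 means "unlimited". Otherwise close one archive;
            # closing the oldest is an assumption made for this sketch.
            if numopen and len(opened) > numopen:
                closed.append(opened.pop(0))
    return closed + opened


if __name__ == "__main__":
    print(assign([6, 5, 4, 3], maxsize=10, numopen=1))
    print(assign([6, 5, 4, 3], maxsize=10, numopen=0))
```

With numopen=1 the four files keep their order but need three archives ([6], [5, 4], [3]). With numopen=0 they fit into two ([6, 4], [5, 3]), at the cost of the files appearing out of order.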
-r|--ratioweight weight
Sets the weight assigned to the estimated size of a file
in the archive as calculated by multiplying the compression
ratio to date by the file’s actual size.
A file’s estimated size in the archive is used to determine if
the file should be included in one of the open archives, or if
a new archive should be started. The estimated size is a
weighted average between the actual file size, and the file
size multiplied by the current compression ratio. The ratio
weight parameter controls how these values are combined. A
value of 0.0 means that the compression ratio has no influence
on the estimated size at all (the estimated file size equals
the actual file size).
The default value is 1.0 (assume the current file will
compress exactly as well as all the previous files).
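As a worked illustration of the weighted average, here is one way the estimate could be written. This is a reading of the description above, not splittar's code: the function name estimated_size is made up, and "compression ratio" is assumed to mean compressed bytes divided by uncompressed bytes so far.

```python
def estimated_size(actual_size, ratio, weight=1.0):
    """Weighted average of the raw size and the ratio-scaled size.

    ratio  -- assumed to be compressed/uncompressed bytes so far (e.g. 0.5)
    weight -- 0.0 ignores the ratio, 1.0 trusts it completely
    """
    return (1.0 - weight) * actual_size + weight * actual_size * ratio


if __name__ == "__main__":
    for weight in (0.0, 0.5, 1.0):
        print(weight, estimated_size(1000, 0.5, weight))
```

For a 1000-byte file and a ratio of 0.5, a weight of 0.0 gives 1000 bytes, 1.0 gives 500 bytes, and 0.5 gives 750 bytes.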
-x|--dontapprox
Set the ratio weight to 0.0
-d|--debug
Print out debug output
-v|--verbose
Print out verbose output (less than debug)
-q|--quiet
Print out hardly anything
--profile
Generates profile data in a file called 'splittar.prof'
in the current directory
file …
List of files and/or directories to archive.
RETURN VALUE
Returns 0 on success
Returns 1 when some files could not be added because access was denied
Returns 2 when an output file could not be created
Returns 254 on an uncaught exception
Returns 255 when the program was interrupted
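If you call splittar from a script, the exit status can be turned into a readable message. The wrapper below is a hypothetical sketch: run_splittar and describe_exit are invented names, it assumes a splittar executable is on your PATH, and only the status codes listed above are taken from the man page.

```python
import subprocess

# Exit statuses as documented in RETURN VALUE.
EXIT_MEANINGS = {
    0: "success",
    1: "some files could not be added because access was denied",
    2: "an output file could not be created",
    254: "uncaught exception",
    255: "program was interrupted",
}


def describe_exit(code):
    """Translate an exit status into the meaning given in the man page."""
    return EXIT_MEANINGS.get(code, "undocumented exit status %d" % code)


def run_splittar(args):
    """Run splittar with `args`; return (exit status, description)."""
    code = subprocess.call(["splittar"] + list(args))
    return code, describe_exit(code)


if __name__ == "__main__":
    # Only the mapping is exercised here, so this runs without splittar.
    for status in (0, 1, 2, 254, 255):
        print(status, describe_exit(status))
```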
EXAMPLES
splittar -f example.tgz -m CD /home
will generate CD-sized files named example-1.tgz, example-2.tgz, etc.
from the data contained in /home
NOTES
splittar may generate files larger than the
specified maximum in some cases. Currently it estimates how much a tar
file will grow based on the current compression ratio. If the size of a
candidate file multiplied by the current compression ratio added to the
current size of the tar file would exceed the maximum size, then a new
file is started. If the candidate file does not compress at least as
well as previous files, then the resulting tar file may be too large.
In addition, the resulting tar file may be too large by a few
kilobytes due to buffering in underlying libraries.
In practice these issues have not been a problem.
Future versions of splittar will attempt
to address these problems.
CAVEATS
At least one input file will be included in each tar file. If a single
input file is larger than the specified maximum, even after any
compression, the resulting tar file will exceed that maximum.
Future versions of splittar may address this by
giving the option to split files that are too large.
DIAGNOSTICS
splittar will output warnings if any of the
generated files exceed the maximum specified size.
AUTHOR
Chris AtLee
SEE ALSO
tar(1)