The naming zoo
You queue a release from a file host. It arrives as nineteen files with names that somebody, at some point in the last thirty years, thought were a good idea. You open the download folder and stare at this:
showname.s01e04.part01.rar
showname.s01e04.part02.rar
showname.s01e04.part03.rar
bigfile.rar
bigfile.r00
bigfile.r01
bigfile.r02
release.7z.001
release.7z.002
release.7z.003
archive.zip
archive.z01
archive.z02
movie.001
movie.002
movie.003
backup.tar.gz.aa
backup.tar.gz.ab
backup.tar.gz.ac
Six different conventions for "this is one file, split into pieces." Nothing in the filenames tells a normal user which piece is the first one or how many pieces there should be. The RAR-lettered layout is the meanest of them: the first part is the one without a number. I had to look that up the first time I hit it, years ago, and I still see people get tripped up by it.
Veloxar's MultiPartArchiveDetector knows about all
six. You drop files into the download folder; it groups them by
base name, figures out the scheme, sorts the parts, and hands
the first one to the extractor. You never type a part number.
Why one detector, six patterns
The cheap version of this is a switch statement on extension.
That falls apart almost immediately. .001 could be
an HJSplit part or a 7-Zip part depending on what sits before it.
A filename like backup.tar.gz.001 has three dots and
no signature telling you whether the outer wrapper is HJSplit or
7z. And movie.release.2026.001 also has three
dots, with an "extension" that is just a number.
So the detector runs five anchored regexes against each filename and stops at the first match. The patterns, lifted straight from the source:
(.rarMulti, #"^(.+)\.part(\d+)\.rar$"#) // part01.rar
(.rarMultiOld, #"^(.+)\.r(\d{2,3})$"#) // .r00
(.sevenZipParts, #"^(.+)\.7z\.(\d{3})$"#) // .7z.001
(.zipSplit, #"^(.+)\.z(\d{2})$"#) // .z01
(.hjSplit, #"^(.+)\.(\d{3})$"#) // .001
// sixth: Unix split via .aa/.ab suffixes
Ordering matters. .7z.001 has to be checked before
the bare HJSplit .001, otherwise every 7-Zip
multi-volume gets misclassified as HJSplit and the base name
comes out as release.7z instead of release. The RAR-lettered
case has its own wrinkle.
bigfile.rar is only a "first part" if there also
happens to be a bigfile.r00 sitting next to it; a
lone .rar is just a plain archive. The detector
handles that in a second pass after the main scan, because you
can only know "this is the first part of a set" once you have
seen the rest of the set.
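To make the two passes concrete, here is a minimal sketch of first-match classification followed by the RAR-lettered fix-up. Only the five regexes come from the listing above; SplitScheme, PartMatch, GroupKey, classify and groupParts are names invented for illustration, the Unix split scheme is left out because its pattern is not shown, and none of this is claimed to be Veloxar's actual code.

```swift
import Foundation

// Hypothetical type and function names; only the five regexes are from the article.
enum SplitScheme: Hashable {
    case rarMulti, rarMultiOld, sevenZipParts, zipSplit, hjSplit
}

struct PartMatch: Equatable {
    let scheme: SplitScheme
    let base: String
    let index: Int
}

struct GroupKey: Hashable {
    let scheme: SplitScheme
    let base: String
}

// Order matters: .7z.001 is tried before the bare .001 pattern.
let patterns: [(SplitScheme, String)] = [
    (.rarMulti,      #"^(.+)\.part(\d+)\.rar$"#),
    (.rarMultiOld,   #"^(.+)\.r(\d{2,3})$"#),
    (.sevenZipParts, #"^(.+)\.7z\.(\d{3})$"#),
    (.zipSplit,      #"^(.+)\.z(\d{2})$"#),
    (.hjSplit,       #"^(.+)\.(\d{3})$"#),
]

/// First pass: returns the first pattern that matches, or nil for anything else.
func classify(_ name: String) -> PartMatch? {
    let whole = NSRange(name.startIndex..., in: name)
    for (scheme, pattern) in patterns {
        guard let regex = try? NSRegularExpression(pattern: pattern),
              let match = regex.firstMatch(in: name, range: whole),
              let baseRange = Range(match.range(at: 1), in: name),
              let indexRange = Range(match.range(at: 2), in: name),
              let index = Int(name[indexRange]) else { continue }
        return PartMatch(scheme: scheme, base: String(name[baseRange]), index: index)
    }
    return nil
}

/// Groups parts by scheme and base name, sorted by index.
func groupParts(_ names: [String]) -> [GroupKey: [String]] {
    var groups: [GroupKey: [(index: Int, name: String)]] = [:]
    for name in names {
        guard let m = classify(name) else { continue }
        groups[GroupKey(scheme: m.scheme, base: m.base), default: []]
            .append((index: m.index, name: name))
    }
    let present = Set(names)
    var ordered: [GroupKey: [String]] = [:]
    for (key, parts) in groups {
        var files = parts.sorted { $0.index < $1.index }.map { $0.name }
        // Second pass: base.rar leads the set only when .rNN siblings exist.
        if key.scheme == .rarMultiOld, present.contains(key.base + ".rar") {
            files.insert(key.base + ".rar", at: 0)
        }
        ordered[key] = files
    }
    return ordered
}
```

With this sketch, a folder holding bigfile.r01, bigfile.rar, bigfile.r00 and lone.rar yields one group ordered as bigfile.rar, bigfile.r00, bigfile.r01, and lone.rar stays out of it. A real implementation would compile the regexes once rather than on every call.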
Missing parts are a failure, not a surprise
The worst time to find out you are missing .part07.rar
is forty-five minutes into extraction, when unrar
reports a CRC error, rolls the whole thing back, and leaves you
scrolling your history for the link that never finished. Multi-part
archives fail late by default. Finding out early is something
you have to do on purpose.
The detector does that work. For each group of parts it takes the
minimum and maximum indices it found, walks the integer range
between them, and flags any index that has no file. The result
carries a missingParts array; isComplete
is just missingParts.isEmpty. If the gap list is
non-empty at the moment you try to extract, you see a message
naming the exact missing indices, and the extraction never runs.
No CRC mystery at the forty-five minute mark.
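The gap check is small enough to sketch in a few lines. The property names missingParts and isComplete are the ones mentioned above; the PartSet wrapper and its indices field are assumptions made for the example, not the detector's real types.

```swift
// Hypothetical container; only missingParts and isComplete are named in the article.
struct PartSet {
    /// Part indices actually found on disk, in any order.
    let indices: [Int]

    /// Every index between the lowest and highest found that has no file.
    var missingParts: [Int] {
        guard let lowest = indices.min(), let highest = indices.max() else { return [] }
        let present = Set(indices)
        return (lowest...highest).filter { !present.contains($0) }
    }

    var isComplete: Bool { missingParts.isEmpty }
}

let set = PartSet(indices: [1, 2, 4, 7])
print(set.missingParts)  // [3, 5, 6]
print(set.isComplete)    // false
```

A check of this shape only sees holes between the lowest and highest index present. A set whose final volume never arrived looks complete to it, so catching that case would need information the filenames alone do not carry.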
This check happens before extraction starts,
not during. The pre-flight catches the cases the tool itself
would only discover halfway through: a .part07.rar
that never finished downloading, a .z01 that went
to the wrong folder, a 7-Zip volume where someone deleted
.002 thinking it was a duplicate of .001.
Two extractions at once, not a hundred
Veloxar supports sixteen archive formats end to end: ZIP, RAR,
7Z, TAR, ISO, DMG, CAB, ARJ, LZH, ACE, plus the compound shapes
TAR loves to wear (tar.gz, tar.bz2,
tar.xz, tgz, tbz2). The
extractor dispatches on format. ZIP goes through
FileManager.unzipItem, which is not part of Foundation itself
but an extension on FileManager of the kind the ZIPFoundation
library provides. RAR and 7Z shell
out to unrar and 7z. TAR calls
/usr/bin/tar. DMG and ISO go through hdiutil
mount, a copy pass, then a detach.
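As a rough sketch of what dispatching on format can look like, the function below maps a filename to a backend. The Backend enum and backend(forFileName:) are invented for this example. The mapping follows the paragraph above, and the formats for which no tool is named are deliberately left unspecified rather than guessed.

```swift
// Hypothetical names; the article describes the dispatch but shows no code for it.
enum Backend: Equatable {
    case inProcessUnzip   // ZIP
    case unrar            // RAR
    case sevenZip         // 7Z
    case tar              // TAR and its compressed variants
    case hdiutil          // DMG, ISO
    case unspecified      // CAB, ARJ, LZH, ACE: no tool is named in the article
    case unsupported
}

func backend(forFileName name: String) -> Backend {
    let lower = name.lowercased()

    // Match TAR on the whole suffix: the last extension of x.tar.gz is only "gz".
    let tarSuffixes = [".tar", ".tar.gz", ".tar.bz2", ".tar.xz", ".tgz", ".tbz2"]
    if tarSuffixes.contains(where: { lower.hasSuffix($0) }) { return .tar }

    guard let dot = lower.lastIndex(of: ".") else { return .unsupported }
    switch String(lower[lower.index(after: dot)...]) {
    case "zip": return .inProcessUnzip
    case "rar": return .unrar
    case "7z": return .sevenZip
    case "dmg", "iso": return .hdiutil
    case "cab", "arj", "lzh", "ace": return .unspecified
    default: return .unsupported
    }
}
```

Checking the compound TAR suffixes before looking at the last extension is the one design choice worth copying. A dispatcher that only reads the final extension would see backup.tar.gz as a bare gzip file.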
Which means extraction is a mix of fast in-process work and slow subprocess work that pins a CPU core and hammers the disk. If you drop a hundred archives onto the app at once and let all of them extract in parallel, you get a machine that sounds like a jet engine and finishes slower than if you had done them one at a time.
ArchiveExtractor caps the active extractions at two.
The rest queue up behind a structured task group. Two is the
number that keeps the disk busy without making the system
unresponsive, and it matches how the app thinks about download
concurrency: generous with I/O, conservative with CPU-bound work.
Each extraction gets a 300-second timeout and up to three
retries, so a single flaky archive cannot wedge the whole queue.
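One common way to cap concurrency with a structured task group is to seed it with as many child tasks as the limit allows and start one more each time a child finishes. The sketch below shows that pattern in isolation. runLimited is a made-up name, the jobs are plain closures rather than real extractions, and the timeout and retry handling are left out, so read it as an illustration of the idea rather than ArchiveExtractor's code.

```swift
// Hypothetical helper: runs `jobs` with at most `limit` in flight at once
// and returns the results in the original order.
func runLimited<T: Sendable>(
    _ jobs: [@Sendable () async -> T],
    limit: Int
) async -> [T] {
    await withTaskGroup(of: (Int, T).self, returning: [T].self) { group in
        var results = [T?](repeating: nil, count: jobs.count)
        var next = 0
        let width = max(1, limit)

        // Seed the group with the first `width` jobs.
        while next < min(width, jobs.count) {
            let index = next
            let job = jobs[index]
            group.addTask { (index, await job()) }
            next += 1
        }

        // Each time a child finishes, record its result and start one more.
        while let (index, value) = await group.next() {
            results[index] = value
            if next < jobs.count {
                let nextIndex = next
                let job = jobs[nextIndex]
                group.addTask { (nextIndex, await job()) }
                next += 1
            }
        }
        return results.compactMap { $0 }
    }
}
```

Calling runLimited(jobs, limit: 2) with a hundred queued jobs keeps two running and ninety-eight waiting, which is the behavior described above. Because the queue lives inside the group's body, cancelling the parent task cancels the running children too.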
Passwords, and why they go through stdin
Some archives are encrypted. RAR and 7Z both accept passwords on
the command line with a flag like -p<password>.
Do not do this. The command line of every running process is
visible to any other user on the machine via ps, and
a password that appears in argv appears in shell
history, in process accounting logs, and sometimes in crash
reports. Putting a secret in argv is putting it on a
billboard.
Veloxar passes archive passwords to unrar and
7z through stdin. The external tool reads the
password from its standard input rather than from its arguments,
and the byte sequence never appears in any process listing. Small
detail, but it is also the kind of thing that, when you get it
wrong, ends up in a security writeup with your product's name in
the title.
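The mechanics of feeding a secret through stdin instead of argv look roughly like this with Foundation's Process and Pipe. The run helper and its parameter names are made up for the example, and /bin/cat stands in for the real tool. Whether a particular unrar or 7z build reads its password prompt from standard input is not something this sketch verifies.

```swift
import Foundation

/// Hypothetical helper: launches `tool`, writes `secret` to its standard input,
/// and returns whatever the tool printed to standard output.
func run(_ tool: URL, arguments: [String], stdinSecret secret: String) throws -> String {
    let process = Process()
    process.executableURL = tool
    process.arguments = arguments   // the secret is deliberately absent from argv

    let input = Pipe()
    let output = Pipe()
    process.standardInput = input
    process.standardOutput = output

    try process.run()

    // Write the secret followed by a newline, then close to signal end of input.
    input.fileHandleForWriting.write(Data((secret + "\n").utf8))
    try input.fileHandleForWriting.close()

    let data = output.fileHandleForReading.readDataToEndOfFile()
    process.waitUntilExit()
    return String(decoding: data, as: UTF8.self)
}

// /bin/cat echoes its stdin, which shows the bytes arrived without ever being an argument.
let echoed = try run(URL(fileURLWithPath: "/bin/cat"), arguments: [], stdinSecret: "hunter2")
print(echoed == "hunter2\n")  // true
```

While the child is running, a process listing shows only /bin/cat with no arguments. The secret exists only in the pipe and in the two processes' memory.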
In the folder
You queue a multi-part release. Seventeen files land in the download folder over the next ten minutes. Veloxar groups them, figures out they are three separate archive sets (one RAR-numbered, one 7-Zip, one stray ZIP from the extras folder), and starts extracting the first two in parallel. The third waits its turn.
If something is missing, say the host's .part09.rar
404'd, you find out before the extractor spends twenty minutes
making progress it is going to throw away. You never type a part
number, never pick a "first file," never right-click and go
hunting for the correct context-menu entry. The auto-cleanup
option can delete the archives once extraction succeeds, but it
is off by default. People are surprisingly attached to their
.rar files, and I would rather ask than surprise
anyone.
The archive is an implementation detail of how the file moved across the network. Once it is on your disk, you should get the thing inside it, not a homework assignment about split conventions from 1998.