Parchive

Filename extension: .par, .par2, .par3, pa3, .p??
Type of format: forward error correction

Parchive (a contraction of parity archive volume set) is an open source software project that emerged in 2001 to develop a parity file format, as conceived by Tobias Rieper and Stefan Wehlus. These parity files use a forward error correction-style system that can be used to perform data verification, and allow recovery when data is lost or corrupted.

The project is currently administered by Ryan Gallagher (binerman), Roger Harrison (kbalore), Willem Monsuwe (monsuwe), and Stefan Wehlus (wehlus).[1]

Overview

Parchive was written to solve the problem of reliably sending large files on Usenet.[2]

Usenet newsgroups were originally designed for informal conversations, and the underlying protocol, NNTP, was not designed to transmit arbitrary binary data. Another limitation, acceptable for conversations but not for files, was that messages were normally fairly short and limited to 7-bit ASCII text.[3]

Various techniques were devised to send files over Usenet, such as uuencoding and Base64. Later Usenet software allowed 8-bit Extended ASCII, which permitted new techniques like yEnc. Large files were broken up to reduce the effect of a corrupted download, but the unreliable nature of Usenet remained.

With the introduction of Parchive, parity files could be created that were then uploaded along with the original data files. If any of the data files were damaged or lost while being propagated between Usenet servers, users could download parity files and use them to reconstruct the damaged or missing files. Parchive included the construction of small index files (*.par in version 1 and *.par2 in version 2) that do not contain any recovery data. These indexes contain file hashes that can be used to quickly identify the target files and verify their integrity.

Because the index files were so small, they minimized the amount of extra data that had to be downloaded from Usenet to verify that the data files were all present and undamaged, or to determine how many parity volumes were required to repair any damage or reconstruct any missing files. They were most useful in version 1 where the parity volumes were much larger than the short index files. These larger parity volumes contain the actual recovery data along with a duplicate copy of the information in the index files (which allows them to be used on their own to verify the integrity of the data files if there is no small index file available).
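The verification step that an index file enables can be sketched as follows. This is a minimal illustration, not the PAR file layout: it assumes the index has already been parsed into a mapping from file name to expected digest, and it assumes MD5 as the hash. The names `md5_of`, `verify_files` and `expected` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(directory, expected):
    """Classify each listed file as 'ok', 'damaged', or 'missing'.

    `expected` maps file name -> hex digest. It stands in for an
    already-parsed index file (an assumption of this sketch).
    """
    report = {}
    for name, want in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            report[name] = "missing"
        elif md5_of(path) != want:
            report[name] = "damaged"
        else:
            report[name] = "ok"
    return report
```

Only the small index has to be fetched to produce such a report; the number of files marked damaged or missing then indicates how much recovery data is worth downloading.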

History

In July 2001, Tobias Rieper and Stefan Wehlus proposed the Parity Volume Set specification, and with the assistance of other project members, version 1.0 of the specification was published in October 2001.[4] Par1 used Reed-Solomon error correction to create new recovery files. An end user could use any of the recovery files to rebuild a missing file from an incomplete download.

Version 1 became widely used on Usenet, but it had several limitations:

  • It was restricted to handle at most 255 files.
  • The recovery files had to be the size of the largest input file, so it did not work well when the input files were of various sizes. (This limited its usefulness when not paired with the proprietary RAR compression tool.)
  • The recovery algorithm had a bug, due to a flaw[5] in the academic paper[6] on which it was based.
  • It was strongly tied to Usenet and it was felt that a more general tool might have a wider audience.

In January 2002, Howard Fukada proposed a new PAR2 specification with two significant changes: data verification and repair should work on blocks of data rather than whole files, and the algorithm should use 16-bit numbers rather than the 8-bit numbers used by PAR1. Michael Nahas and Peter Clements took up these ideas in July 2002, with additional input from Paul Nettle and Ryan Gallagher (who both wrote Par1 clients). Version 2.0 of the Parchive specification was published by Michael Nahas in September 2002.[7]

Peter Clements then went on to write the first two PAR2 implementations: QuickPar and par2cmdline.

Versions

Versions 1 and 2 of the file format are incompatible. (However, many clients support both.)

Version 1

For version 1, given files f1, f2, ..., fn, the Parchive consists of an index file (f.par) and a number of "parity volumes" (f.p01, f.p02, etc.). If one original file is missing (for example, f2), it can be recreated from all of the other original files and any one of the parity volumes. Likewise, two missing files can be recreated from any two of the parity volumes, and so forth.[8]

Version 1 supports up to 256 recovery files. Each recovery file must be the size of the largest input file.
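The single-missing-file case can be illustrated with plain XOR parity. This is a deliberately simplified stand-in for the Reed-Solomon coding that Par1 uses: XOR yields only one parity volume and can therefore repair only one missing file, whereas Reed-Solomon produces several independent volumes. The sketch also shows why a recovery volume is as large as the largest input, since shorter files are zero-padded to that length. The function names are hypothetical, file contents are modelled as byte strings, and the length of the missing file is assumed to be known.

```python
def make_parity(files):
    """XOR all inputs together, zero-padding each to the longest length."""
    size = max(len(data) for data in files)
    parity = bytearray(size)
    for data in files:
        for i, byte in enumerate(data):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(surviving, parity, missing_length):
    """Rebuild the one missing input from the survivors and the parity."""
    rebuilt = bytearray(parity)
    for data in surviving:
        for i, byte in enumerate(data):
            rebuilt[i] ^= byte
    # Drop the zero padding that was added when the parity was built.
    return bytes(rebuilt[:missing_length])


# Three inputs of different sizes; f2 is then treated as lost.
f1, f2, f3 = b"alpha", b"bravo-bravo", b"charlie"
parity = make_parity([f1, f2, f3])
assert len(parity) == len(f2)  # parity is as large as the largest input
assert recover_missing([f1, f3], parity, len(f2)) == f2
```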

Version 2

Version 2 files generally use this naming/extension system: filename.vol000+01.PAR2, filename.vol001+02.PAR2, filename.vol003+04.PAR2, filename.vol007+06.PAR2, etc. The +01, +02, etc. in the filename indicate how many recovery blocks the file contains, and the vol000, vol001, vol003, etc. indicate the number of the first recovery block within the PAR2 file. If the index file of a download shows that 4 blocks are missing, the easiest way to repair the files is to download filename.vol003+04.PAR2. However, because any recovery blocks can be used, filename.vol007+06.PAR2 is also acceptable.
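The naming convention described above can be read mechanically. The following sketch parses the two numbers out of a volume name and picks the smallest single volume that holds enough recovery blocks. It relies only on the naming pattern given in the text; the function names are hypothetical and no real PAR2 client API is implied.

```python
import re

# Matches the ".volNNN+MM.par2" suffix described in the text.
VOLUME_NAME = re.compile(r"\.vol(\d+)\+(\d+)\.par2$", re.IGNORECASE)


def parse_volume(name):
    """Return (first_block, block_count) for a PAR2 volume name, else None."""
    match = VOLUME_NAME.search(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def smallest_sufficient(names, blocks_needed):
    """Pick the single volume with the fewest blocks that covers the need."""
    candidates = []
    for name in names:
        parsed = parse_volume(name)
        if parsed is not None and parsed[1] >= blocks_needed:
            candidates.append((parsed[1], name))
    return min(candidates)[1] if candidates else None


volumes = [
    "filename.vol000+01.PAR2",
    "filename.vol001+02.PAR2",
    "filename.vol003+04.PAR2",
    "filename.vol007+06.PAR2",
]
print(smallest_sufficient(volumes, 4))  # filename.vol003+04.PAR2
```

A fuller client could also combine several small volumes to reach the required count; this sketch only considers one volume at a time.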

Version 2 supports up to 32768 (2^15) recovery blocks. Input files are split into multiple equal-sized blocks so that recovery files do not need to be the size of the largest input file.
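The effect of block-based splitting can be seen with a little arithmetic. The sketch below counts how many equal-sized blocks a set of input files occupies, assuming each file is padded up to a whole number of blocks so that no block spans two files. The function names and the example sizes are hypothetical.

```python
def block_count(file_size, block_size):
    """Number of blocks needed to hold one file (ceiling division)."""
    return -(-file_size // block_size)


def total_source_blocks(file_sizes, block_size):
    """Total blocks across all input files, each padded to whole blocks."""
    return sum(block_count(size, block_size) for size in file_sizes)


sizes = [1000, 2500, 300]   # bytes, illustrative only
block_size = 1024

print(total_source_blocks(sizes, block_size))  # 1 + 3 + 1 = 5
```

With these sizes a single recovery block is 1024 bytes, whereas a version 1 recovery file for the same inputs would have to be 2500 bytes, the size of the largest file.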

There is no support for Unicode (it is planned for version 3).[9]

Directory support is provided in MultiPar's implementation of PAR2.

Version 3

Version 3 does not officially[10] exist yet,[11] but is planned to add:[12][13]

  • fixes for problems that arise when creating or repairing with a very high block count or block size.
  • directory support.
  • file moving and renaming support.[14]
  • Unicode support.

An application written for PAR2 will not be able to understand PAR3 files.

Software

  • Windows
      • MultiPar - builds upon QuickPar's features and GUI, with support for PAR3, multithreading, multiple processors, and the ability to recurse subfolders (GPL)
      • QuickPar - freeware, unmaintained since 2004, superseded by MultiPar
      • par2+tbb - a concurrent (multithreaded) version of par2cmdline 0.4 (GPLv2 or later)
      • Par-N-Rar (GPL)
      • phpar2 - advanced par2cmdline with multithreading and highly optimized assembler code (about 66% faster than QuickPar 0.9.1)
      • Rarslave (GPLv2)
      • SmartPAR (no support for PAR2)
  • Mac OS X
      • par2+tbb - a concurrent (multithreaded) version of par2cmdline 0.4 (GPLv2 or later)

References

  1. ^ "Parchive: Parity Archive Tool: contacts". http://parchive.sourceforge.net/#contacts. Retrieved 2009-10-29. 
  2. ^ "Parchive: Parity Archive Volume Set". http://parchive.sourceforge.net/#desc. Retrieved 2009-10-29. "The original idea behind this project was to provide a tool to apply the data-recovery capability concepts of RAID-like systems to the posting and recovery of multi-part archives on Usenet." 
  3. ^ Kantor, Brian; Lapsley, Phil (February 1986). "Character Codes". Network News Transfer Protocol. IETF. p. 5. sec. 2.2. RFC 977. http://tools.ietf.org/html/rfc977#section-2.2. Retrieved 2009-10-29. 
  4. ^ Nahas, Michael (2001-10-14). "Parchive: Parity Volume Set specification 1.0". http://sourceforge.net/docman/display_doc.php?docid=7273&group_id=30568. Retrieved 2009-04-07. [dead link]
  5. ^ Plank, James S.; Ding, Ying (April 2003). "Note: Correction to the 1997 Tutorial on Reed-Solomon Coding". http://www.cs.utk.edu/~plank/plank/papers/CS-03-504.html. Retrieved 2009-10-29. 
  6. ^ Plank, James S. (September 1997). "A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems". http://www.cs.utk.edu/~plank/plank/papers/SPE-9-97.html. Retrieved 2009-10-29. 
  7. ^ Nahas, Michael; Clements, Peter; Nettle, Paul; Gallagher, Ryan (2003-05-11). "Parity Volume Set Specification 2.0". http://parchive.sourceforge.net/docs/specifications/parity-volume-spec/article-spec.html. Retrieved 2009-10-29. 
  8. ^ Wang, Wallace (2004-10-25). "Finding movies (or TV shows): Recovering missing RAR files with PAR and PAR2 files". Steal this File Sharing Book (1st ed.). San Francisco, California: No Starch Press. pp. 164–167. ISBN 1-59327-050-X. http://books.google.com/books?id=FGfMS5kymmcC&pg=PT183. Retrieved 2009-09-24.
  9. ^ "QuickPar forum posting". http://www.quickpar.co.uk/forum/viewtopic.php?id=1065.
  10. ^ "Beta release from MultiPar with PAR3 beta functionality". http://hp.vector.co.jp/authors/VA021385/.
  11. ^ "QuickPar forum posting - status PAR3". http://www.quickpar.org.uk/forum/viewtopic.php?id=1264.
  12. ^ "QuickPar forum posting - PAR3 specifications". http://www.quickpar.co.uk/forum/viewtopic.php?id=1047.
  13. ^ "PAR3 proposal". http://hp.vector.co.jp/authors/VA021385/par3_spec_prop.htm.
  14. ^ "PAR3 move/rename brainstorming". http://www.livebusinesschat.com/smf/index.php?topic=4751.0.

Wikimedia Foundation. 2010.
