PersBackup

Personal Backup Version 5.9

© 2001 − 2018, Dr. J. Rathlev

Special notes on the current version

From Version 5.0, Personal Backup has been created using an IDE with full Unicode support (currently Delphi 10 Seattle). The former limitation to ANSI (ISO-8859) filenames when copying files therefore no longer applies. In addition, path lengths may be longer than 260 characters.

The most important enhancements:

Compressing as gzip

Filenames

The existing standard for the gzip format (RFC1952 of 1996) calls for the filename to be stored in the file header using the ISO-8859-1 character set. I could not find any recommendations as to how to handle Unicode filenames.

The current Linux version of the program gzip used for creating and reading gz archives differs from the above standard and stores filenames in UTF-8 format. The OS byte in the header is set to 3 (Unix).

Until now, Personal Backup has set this byte to 0 (FAT) and saved the filename in ISO-8859-1. To remain compatible with previous versions and also to support Unicode, two variants are used in Version 5:

1. Filename in ISO-8859-1, OS byte = 0 (FAT)
2. Filename in UTF-8, OS byte = 11 (NTFS)

As a consequence, other programs (such as WinZip or WinRar) correctly detect the stored filename only with the first variant. This has, however, no effect on unpacking the files.

It would of course be better to use the unused bit 5 of the FLG byte as a criterion for the encoding of the filename. The current zip format uses a flag bit for this purpose (see below).
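As an illustration of the two conventions, the following Python sketch reads the filename from a gzip header and picks the character set from the OS byte. The function name `read_gzip_filename` and the mapping table are assumptions made for this example and are not taken from Personal Backup; OS values other than 0, 3 and 11 fall back to ISO-8859-1 as called for by RFC1952.

```python
import struct

FEXTRA = 0x04  # FLG bit 2: an extra field is present
FNAME = 0x08   # FLG bit 3: the original filename is present

# Assumed mapping, following the conventions described above:
# 0 (FAT) -> ISO-8859-1, 3 (Unix, Linux gzip) and 11 (NTFS) -> UTF-8
ENCODING_BY_OS = {0: "iso-8859-1", 3: "utf-8", 11: "utf-8"}


def read_gzip_filename(data: bytes):
    """Return the filename stored in a gzip header, or None if there is none."""
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    flg = data[3]
    os_byte = data[9]
    pos = 10  # the fixed part of the header is 10 bytes long
    if flg & FEXTRA:
        # Skip the extra field: 2-byte little-endian length, then the data
        (xlen,) = struct.unpack_from("<H", data, pos)
        pos += 2 + xlen
    if not flg & FNAME:
        return None
    end = data.index(b"\x00", pos)  # the filename is zero-terminated
    encoding = ENCODING_BY_OS.get(os_byte, "iso-8859-1")
    return data[pos:end].decode(encoding)


# Hand-built header: FLG = FNAME, OS = 11, filename stored as UTF-8
header = (b"\x1f\x8b\x08\x08" + bytes(4) + b"\x00\x0b"
          + "Übung.txt".encode("utf-8") + b"\x00")
name = read_gzip_filename(header)
```

A program that ignores the OS byte and always decodes per ISO-8859-1 would show two garbled characters in place of the "Ü" in this example.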

Files > 4 GB

The existing standard for the gzip format (RFC1952 of 1996) reserves a 32-bit value for the length of the uncompressed file. For files > 4 GB this value is written modulo 2^32. But like many file archive programs (e.g. 7-Zip), Personal Backup supports the use of an extra field with the signature 0x0100 containing the real file size.
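The effect of the 32-bit length field can be shown with a short calculation. The Python sketch below is only an illustration; the function name `gzip_isize` is made up for this example.

```python
def gzip_isize(uncompressed_size: int) -> int:
    """Value stored in the 32-bit length field of the gzip trailer."""
    return uncompressed_size % 2**32


size = 5 * 1024**3          # a file of 5 GiB = 5368709120 bytes
stored = gzip_isize(size)   # 1073741824, i.e. only 1 GiB
```

A reader that relies on the trailer alone would therefore report 1 GiB for this file, which is why the real size has to be stored separately.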

Encrypted gzip files

For more details refer to the description below.

Creating zip archives

The current zip format specification version 6.3.2 from October 2007 defines how Unicode filenames are to be processed: if bit 11 of the "general purpose bit flag" is set, filename and comment are encoded in UTF-8. Personal Backup uses this convention. Most current file-compression programs now support this format as well, among them WinZip Version 12, WinRar Version 3.80 and 7-Zip Version 9.20, whereas Windows Explorer does not yet support it even under Windows 7.
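Whether an archive entry uses this convention can be checked with a few lines of Python. The sketch below uses the standard `zipfile` module, which sets bit 11 on its own when a filename contains non-ASCII characters; the helper name `utf8_flag_set` is an assumption for this example.

```python
import io
import zipfile

UTF8_FLAG = 0x0800  # bit 11 of the general purpose bit flag


def utf8_flag_set(info: zipfile.ZipInfo) -> bool:
    """True if filename and comment of this entry are marked as UTF-8."""
    return bool(info.flag_bits & UTF8_FLAG)


# Build a small archive in memory with one Unicode and one ASCII filename
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Übung.txt", b"demo")
    zf.writestr("plain.txt", b"demo")

# Read it back and collect the flag state per entry
with zipfile.ZipFile(buf) as zf:
    flags = {info.filename: utf8_flag_set(info) for info in zf.infolist()}
```

The same check can be applied to an archive created by any other program by passing its path to `zipfile.ZipFile` instead of the in-memory buffer.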

Encryption

Files are encrypted using the AES method, with the same routines as in WinZip (see the information provided by WinZip and by Brian Gladman). The file format created depends on the backup mode:

No compression (proprietary file format):
Data is written to the file in the same way as with zip but without prior compression. As this is a non-standard format, a restore can only be made with the internal function of the program.
   Signature  : 4 Bytes: JREx  (since version 5.8.5)
   Enc-Header : 10, 14 or 18 bytes (depending on the encryption depth):
                Salt value (8, 12 or 16 bytes) + password verification value (2 bytes)
   Enc-Data   : Same number of bytes as original file
   Enc-Trailer: 10 bytes Authentication code
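The layout above implies a fixed overhead per file. The following Python sketch computes the expected size of such a file. It assumes that the salt lengths of 8, 12 and 16 bytes belong to key lengths of 128, 192 and 256 bits, as in the WinZip AES specification; the names used are made up for this example.

```python
SIGNATURE_BYTES = 4      # "JREx" signature (since version 5.8.5)
PWD_VERIFY_BYTES = 2     # password verification value
AUTH_CODE_BYTES = 10     # authentication code in the trailer

# Assumption: salt length per key length as in the WinZip AES specification
SALT_BYTES = {128: 8, 192: 12, 256: 16}


def encrypted_size(original_size: int, key_bits: int) -> int:
    """Expected size of an uncompressed, encrypted file in this layout."""
    enc_header = SALT_BYTES[key_bits] + PWD_VERIFY_BYTES  # 10, 14 or 18 bytes
    return SIGNATURE_BYTES + enc_header + original_size + AUTH_CODE_BYTES


total = encrypted_size(1000, 256)  # 4 + 18 + 1000 + 10 = 1032
```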
With compression (modified gzip format):
Data is first compressed using the gzip algorithm and then encrypted using the AES method. The encrypted data has its own verification, so the gzip checksum is always set to 0. The gzip file format is specified in RFC1952. Bear in mind that this standard does not contain recommendations for encryption. The format described below just follows its specifications. For a restore, the internal function of the program is required.
Note: starting with Version 5.2 encrypted gz files will have the file extension gze.
   Gzip-Header : 10 bytes as default
                 new: Flag byte: bit 5 = encrypted
   Extra field : (ID=1) Optional for files > 4GB  - 20 bytes
   Extra field : (ID=$524A) Signature JR + specification of the encryption depth 
                 (since version 5.8.5) - 6 bytes
   Filename    : ISO-8859-1 (OS=0 - FAT) or UTF-8 (OS=11 - NTFS)
   Enc-Header  : 10, 14 or 18 bytes (depending on the encryption depth)
   Enc-Data    : Same number of bytes as compressed original file
   Enc-Trailer : 10 bytes  (see above)
   Gzip-Trailer: 8 bytes
                 CRC always = 0
Zip file (largely compatible with the zip standard):
A description of the zip format can be found at PkWare and notes on encryption at WinZip. Personal Backup creates archives with encrypted file data compatible with WinZip. For restore, any compression program supporting these specifications can be used.
Important note: Since version 5.9.4, Personal Backup offers the additional option of also encrypting filenames. Zip archives created this way are no longer fully compatible with WinZip. If you open such an archive using a program like WinZip or 7-Zip, the original directory structure will not be displayed. Instead, all files and directories are labeled with a sequential hex number. The real name is encrypted and stored in an extra field within the local file header. To unpack such an archive, the internal restore function or the add-on program PbRestore is required.
Amendment to the WinZip format:
    Local File Header / Central Directory Header:
       general purpose bit flag - Bit 8:	filenames are encrypted
       
    Extra Data Field for encrypted filenames
    ----------------------------------------
    Offset  Size  Contents  
    0       2     Header ID of extra field (0x9909)  
    2       2     Data size (n) in bytes (variable)
    4       n     Encrypted filename     
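
The extra data field described above can be located by walking through the extra field area of a header, which in zip files is a sequence of blocks each consisting of a 2-byte ID, a 2-byte size and the data (all little-endian). The Python sketch below only extracts the still encrypted bytes; decryption is not shown, and the function and constant names are assumptions for this example.

```python
import struct

ENCRYPTED_NAME_ID = 0x9909     # header ID of the extra field
ENCRYPTED_NAME_FLAG = 0x0100   # bit 8 of the general purpose bit flag


def find_extra_field(extra: bytes, header_id: int):
    """Return the data of the first extra field with this ID, or None."""
    pos = 0
    while pos + 4 <= len(extra):
        field_id, size = struct.unpack_from("<HH", extra, pos)
        pos += 4
        if field_id == header_id:
            return extra[pos:pos + size]
        pos += size  # skip the data of a field we are not interested in
    return None


# Hand-built extra area: an unrelated field followed by the filename field
extra = (struct.pack("<HH", 0x0001, 2) + b"ab"
         + struct.pack("<HH", ENCRYPTED_NAME_ID, 3) + b"xyz")
encrypted_name = find_extra_field(extra, ENCRYPTED_NAME_ID)  # b"xyz"
```

Before looking for the field, a reader would first test bit 8 of the flag word, e.g. `flag_bits & ENCRYPTED_NAME_FLAG`.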
   

Passwords

All passwords for FTP, SMTP and AES encryption must be encoded in ISO-8859-1.
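
Whether a given password meets this requirement can be tested beforehand, for example with the following Python sketch. The function name `is_latin1_password` is made up for this example and is not part of the program.

```python
def is_latin1_password(password: str) -> bool:
    """True if every character of the password exists in ISO-8859-1."""
    try:
        password.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


ok = is_latin1_password("Grüße123")   # umlauts and ß are part of ISO-8859-1
bad = is_latin1_password("€uro")      # the euro sign is not
```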

Length of file paths

For filenames (including the path), the 260-character limitation still applies at certain points in all Windows versions. This affects all applications that use the non-Unicode versions of the Windows API calls and, under Windows XP, also all applications that use Windows Shell components such as Explorer. Starting with Windows 7, this limitation no longer appears to apply.
Wherever Personal Backup relies on shell components (e.g. in a directory- or file-selection dialog), the path length limitation still applies in Version 5, except under Windows 7 and newer.
Internally, the program uses the path prefix "\\?\" for all file processing functions (e.g. when copying files), which allows a maximum length of about 32000 characters. With Version 5 it is therefore possible to back up, restore and delete files with paths exceeding the above limit, even when some other programs (including Windows XP Explorer) fail when scanning such a directory tree. One file manager that supports long filenames is Total Commander Version 7.5.
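
How such a prefix is applied can be sketched in a few lines of Python. This is not the routine used by Personal Backup (which is written in Delphi) but a minimal illustration; the function name is made up, the path is assumed to be absolute and to use backslashes, and the "UNC" form for network paths is taken from the Windows API documentation, not from the text above.

```python
PREFIX = "\\\\?\\"  # the four characters \\?\


def to_extended_path(path: str) -> str:
    """Add the extended-length prefix to an absolute Windows path."""
    if path.startswith(PREFIX):
        return path                          # already prefixed
    if path.startswith("\\\\"):
        # Network path \\server\share\... becomes \\?\UNC\server\share\...
        return PREFIX + "UNC\\" + path[2:]
    return PREFIX + path


local = to_extended_path("C:\\data\\file.txt")
network = to_extended_path("\\\\server\\share\\file.txt")
```

Here `local` holds `\\?\C:\data\file.txt` and `network` holds `\\?\UNC\server\share\file.txt`. Paths prefixed this way are passed to the Unicode versions of the Windows API calls without further normalization, so relative parts such as ".." must be resolved beforehand.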


J. Rathlev, 24222 Schwentinental, Germany, April 2018