Helix file types

Helix supports a set of file types that determine how files are stored by the Helix server and whether they can be diffed. When you add a file, Helix attempts to determine its type automatically: it first determines whether the file is a regular file or a symbolic link, and then examines the first part of the file to determine whether it is text or binary. If any non-text characters are found, the file is assumed to be binary; otherwise, the file is assumed to be text. (Files in Unicode environments are detected differently; see Helix file type detection and Unicode.)

To determine the type of a file under Helix control, issue the p4 opened or p4 files command. To change the Helix file type, specify the -t filetype option. For details about changing file type, refer to the descriptions of p4 add, p4 edit, and p4 reopen in the P4 Command Reference.

Helix supports the following file types:

| Keyword | Description | Comments | Stored as |
| --- | --- | --- | --- |
| apple | Mac file | AppleSingle storage of Mac data fork, resource fork, file type and file creator. For full details, please see the Mac client release notes. | full file, compressed, AppleSingle format |
| binary | Non-text file | Synced as binary files in the workspace. Stored compressed within the depot. | full file, compressed |
| resource | Mac resource fork | (Obsolete) This type is supported for backward compatibility, but the apple file type is recommended. | full file, compressed |
| symlink | Symbolic link | Helix applications on UNIX, OS X, and recent versions of Windows treat these files as symbolic links. On other platforms, these files appear as (small) text files. | delta |
| text | Text file | Synced as text in the workspace. Line-ending translations are performed automatically. | delta |
| unicode | Unicode file | Helix servers operating in Unicode mode support the unicode file type. These files are translated into the local character set specified by P4CHARSET. Helix servers not in Unicode mode do not support the unicode file type. For details, see the Internationalization Notes. | delta, UTF-8 |
| utf8 | Unicode file | Whether the service is in Unicode mode or not, files that are detected as UTF-8 are stored as UTF-8 and synced as UTF-8 without being translated by the P4CHARSET setting. For details, see the Internationalization Notes. | delta, UTF-8 |
| utf16 | Unicode file | Whether the service is in Unicode mode or not, files are transferred as UTF-8, and translated to UTF-16 (with byte order mark, in the byte order appropriate for the user’s computer) in the client workspace. For details, see the Internationalization Notes. | delta, UTF-8 |

File type modifiers

You can apply file type modifiers to the base types of specific files to preserve timestamps, expand RCS keywords, specify how files are stored in the service, and more. For details about applying modifiers to file types, see Specifying how files are stored in Helix.

The following table lists the file type modifiers:

| Modifier | Description | Comments |
| --- | --- | --- |
| +C | Helix stores the full compressed version of each file revision | Default storage mechanism for binary files and newly-added text, unicode, or utf16 files larger than 10MB. |
| +D | Helix stores deltas in RCS format | Default storage mechanism for text files. |
| +F | Helix stores full file per revision | For large ASCII files that aren’t treated as text, such as PostScript files, where storing the deltas is not useful or efficient. |
| +k | RCS (Revision Control System) keyword expansion | Supported keywords are as follows: $Id$; $Header$; $Date$ (date of submission); $DateUTC$ (date of submission in UTC time zone); $DateTime$ (date and time of submission); $DateTimeUTC$ (date and time of submission in UTC time zone); $DateTimeTZ$ (date and time of submission in the server’s time zone, but including the actual time zone in the result); $Change$; $File$; $Revision$; $Author$. RCS keywords are case-sensitive. A colon after the keyword (for example, $Id:$) is optional. |
| +ko | Limited keyword expansion | Expands only the $Id$ and $Header$ keywords. Primarily for backwards compatibility with versions of Helix prior to 2000.1, and corresponds to the +k (ktext) modifier in earlier versions of Helix. |
| +l | Exclusive open (locking) | If set, only one user at a time can open a file for editing. Useful for binary file types (such as graphics) where merging of changes from multiple authors is not possible. |
| +m | Preserve original modification time | The file’s timestamp on the local file system is preserved upon submission and restored upon sync. Useful for third-party DLLs in Windows environments, because the operating system relies on the file’s timestamp. By default, the modification time is set to the time you synced the file. |
| +S | Only the head revision is stored | Older revisions are purged from the depot upon submission of new revisions. Useful for executable or .obj files. |
| +Sn | Only the most recent n revisions are stored, where n is a number from 1 to 10, or 16, 32, 64, 128, 256, or 512. | Older revisions are purged from the depot upon submission of more than n new revisions, or if you change an existing +Sn file’s n to a number less than its current value. For details, see the P4 Command Reference. Using an +Sn file modifier results in special behavior when you delete and re-add a file: no file revisions are deleted that were submitted before the add or delete. For example, if a file of type +S2 is marked as deleted in revision 5, and then re-added with the same file type and modifier, revisions 3 and 4 are not purged. |
| +w | File is always writable on client | Not recommended, because Helix manages the read-write settings on files under its control. |
| +x | Execute bit set on client | Used for executable files. |
| +X | Archive trigger required | The Helix server runs an archive trigger to access the file. See the Helix Versioning Engine Administrator Guide: Fundamentals for details. |
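The +Sn rule in the table above can be modeled with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the retention model deliberately ignores the delete and re-add special case.

```python
# Values of n that the +Sn modifier accepts, as listed in the table above.
VALID_N = set(range(1, 11)) | {16, 32, 64, 128, 256, 512}


def is_valid_revision_limit(n: int) -> bool:
    """Return True if n is an allowed revision count for +Sn."""
    return n in VALID_N


def stored_revisions(head: int, n: int) -> list:
    """Revision numbers kept for a +Sn file whose head revision is `head`.

    Simplified model: assumes every revision was a plain edit and
    ignores the delete/re-add behavior described in the table.
    """
    if not is_valid_revision_limit(n):
        raise ValueError(f"+S{n} is not a supported modifier")
    first = max(1, head - n + 1)
    return list(range(first, head + 1))
```

For example, under this model a +S2 file at head revision 7 keeps only revisions 6 and 7.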

Specifying how files are stored in Helix

File revisions of binary files are normally stored in full within the depot, but only changes made to text files since the previous revision are normally stored. This approach is called delta storage, and Helix uses RCS format to store its deltas. The file’s type determines whether full file or delta storage is used.

Some file types are compressed to gzip format when stored in the depot. The compression occurs when you submit the file, and decompression happens when you sync (copy the file from the depot to your workspace). The client workspace always contains the file as it was submitted.

Warning

To avoid inadvertent file truncation, do not store binary files as text. If you store a binary file as text from a Windows computer and the file contains the Windows end-of-file character ^Z, only the part of the file up to the ^Z is stored in the depot.
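The effect described in the warning can be pictured with a short Python sketch. It models only the outcome (content after the first ^Z is lost); it is not how Helix is implemented, and the function name is hypothetical.

```python
CTRL_Z = b"\x1a"  # the Windows end-of-file character ^Z


def stored_if_treated_as_text(data: bytes) -> bytes:
    """Model the truncation: keep only the bytes before the first ^Z."""
    return data.split(CTRL_Z, 1)[0]


# A PNG header happens to contain ^Z, so a PNG stored as text from
# Windows would lose almost all of its content under this model.
png_like = b"\x89PNG\r\n\x1a\n" + b"image data"
truncated = stored_if_treated_as_text(png_like)
```

Here `truncated` holds only the six bytes that precede the ^Z, which is why such files should be given the binary file type.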

Assigning file types for Unicode files

The Helix server can be run in Unicode mode to activate support for filenames and Helix metadata that contain Unicode characters, or in non-Unicode mode, where filenames and metadata must be ASCII, but textual files containing Unicode content are still supported.

If you need to manage textual files that contain Unicode characters, but do not need Unicode characters in Helix metadata, you do not need to run Helix in Unicode mode. Assign the Helix utf16 file type to textual files that contain Unicode characters.

Your system administrator will be able to tell you which mode the service is using.

In either mode, Helix supports a set of file types that enable it to determine how a file is stored and whether the file can be diffed. The following sections describe the considerations for managing textual files in Unicode environments.

To assign a file type when adding a file to the depot, specify the -t option. For example:

$ p4 add -t utf16 newfile.txt

To change the file type of a file in the depot, open the file for edit, specifying the -t option. For example:

$ p4 edit -t utf16 myfile.txt

Choosing the file type

When assigning file types to textual files that contain Unicode, consider the following:

  • Do you need to edit and diff the files?

    Many IDEs create configuration files that you never edit manually or diff. To ensure they are never translated, assign such files the binary file type.

  • Is your site managing files that use different character sets?

    If so, consider storing them using a utf16 file type, to ensure they are not translated but still can be diffed.

Unicode mode services translate the contents of Unicode files into the character set specified by P4CHARSET. The following table provides more details about how Unicode-mode services manage the various types of text files:

| Text file type | Stored by Helix as (Unicode mode) | Validated? | Translated per P4CHARSET? | Translated per client platform |
| --- | --- | --- | --- | --- |
| text | Extended ASCII | No | No | No |
| unicode | UTF-8 | Yes (as UTF-16 and P4CHARSET) | Yes | No |
| utf16 | UTF-8 | Yes (as UTF-16) | No | No |

Non-Unicode-mode services do not translate or verify the contents of unicode files. Instead, the UTF-8 data is converted to UTF-16 using the byte order appropriate to the client platform. To ensure that such files are not corrupted when you edit them, save them as UTF-8 or UTF-16 from within your editing software.

| Text file type | Stored by Helix as (non-Unicode mode) | Validated? | Translated per P4CHARSET? | Translated per client platform |
| --- | --- | --- | --- | --- |
| text | Extended ASCII | No | No | No |
| unicode | UTF-8 | Yes (as UTF-16 and P4CHARSET) | No | No |
| utf16 | UTF-8 | Yes (as UTF-16) | No | Yes |
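The utf16 behavior described above (UTF-8 in the depot, UTF-16 with a byte order mark in the client's byte order in the workspace) can be illustrated in Python. This is a sketch of the encoding step only, using Python's standard codecs; the function name is hypothetical and nothing here calls Helix.

```python
import codecs
import sys


def to_workspace_utf16(stored_utf8: bytes) -> bytes:
    """Convert depot-side UTF-8 bytes to UTF-16 with a BOM.

    Python's "utf-16" codec writes a byte order mark and uses the
    byte order of the machine it runs on.
    """
    text = stored_utf8.decode("utf-8")
    return text.encode("utf-16")


stored = "caf\u00e9".encode("utf-8")      # 5 bytes as UTF-8
workspace = to_workspace_utf16(stored)    # BOM + 4 two-byte code units

# The BOM that matches this machine's byte order.
native_bom = (
    codecs.BOM_UTF16_LE if sys.byteorder == "little" else codecs.BOM_UTF16_BE
)
```

The same text therefore produces different bytes on little-endian and big-endian clients, which is why the table lists utf16 files as translated per client platform.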

Helix file type detection and Unicode

In both Unicode mode and non-Unicode mode, if you do not assign a file type when you add a file to the depot, Helix (by default) attempts to detect file type by scanning the first 65536 characters of the file. If non-printable characters are detected, the file is assigned the binary file type. (In Unicode mode, a further check is performed: if there are no non-printable characters, and there are high-ASCII characters that are translatable using the character set specified by P4CHARSET, the file is assigned the unicode file type.)

Finally (for services running in Unicode mode or non-Unicode mode), if a UTF-16 BOM is present, the file is assigned the utf16 file type. Otherwise, the file is assigned the text file type. (In Unicode mode, a further check is performed: files with high-ASCII characters that are undefined in the character set specified by P4CHARSET are assigned the binary file type.)

In most cases, there is no need to override Helix’s default file type detection. If you must override Helix’s default file type detection, you can assign Helix file types according to a file’s extension, by issuing the p4 typemap command. For more about using the typemap feature, refer to the Helix Versioning Engine Administrator Guide: Fundamentals, and the P4 Command Reference.
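The detection steps described in this section can be approximated as follows for a service in non-Unicode mode. This is a simplified sketch, not Helix's actual algorithm: the definition of "non-printable" and the decision to check the byte order mark first are assumptions, and the P4CHARSET checks used in Unicode mode are omitted.

```python
SCAN_LIMIT = 65536  # the text says the first 65536 characters are scanned

UTF16_BOMS = (b"\xff\xfe", b"\xfe\xff")

# Assumption: "non-printable" means control bytes other than common
# whitespace (tab, line feed, form feed, carriage return).
ALLOWED_CONTROL = {0x09, 0x0A, 0x0C, 0x0D}


def guess_file_type(data: bytes) -> str:
    """Return 'utf16', 'binary', or 'text' for the given file contents."""
    head = data[:SCAN_LIMIT]
    # Assumption: the BOM is checked first, because UTF-16 content
    # normally contains NUL bytes that would otherwise look binary.
    if head.startswith(UTF16_BOMS):
        return "utf16"
    for byte in head:
        if (byte < 0x20 and byte not in ALLOWED_CONTROL) or byte == 0x7F:
            return "binary"
    # High-ASCII bytes are accepted here; in Unicode mode Helix applies
    # further checks against P4CHARSET that this sketch does not model.
    return "text"
```

Because only the start of the file is scanned, a stray non-printable byte beyond the scan limit does not change the result in this model.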

Overriding file types

Some file formats (for example, Adobe PDF files, and Rich Text Format files) are actually binary files, but they can be mistakenly detected by Helix as being text. To prevent this problem, your system administrator can use the p4 typemap command to specify how such file types are stored. You can always override the file type specified in the typemap table by specifying the -t filetype option.

Preserving timestamps

Normally, Helix updates the timestamp when a file is synced. The modification time (+m) modifier is intended for developers who need to preserve a file’s original timestamp. This modifier enables you to ensure that the timestamp of a file synced to your client workspace is the time on your computer when the file was submitted.

Windows uses timestamps on third-party DLLs for versioning information (both within the development environment and also by the operating system), and the +m modifier enables you to preserve the original timestamps to prevent spurious version mismatches. The +m modifier overrides the client workspace [no]modtime setting (for the files to which it is applied). For details about this setting, refer to File type modifiers.

Expanding RCS keywords

RCS (Revision Control System), an early version control system, defined keywords that you can embed in your source files. These keywords are updated whenever a file is committed to the repository. Helix supports some RCS keywords.

To activate RCS keyword expansion for a file, use the +k modifier. RCS keywords are expanded as follows.

| Keyword | Expands To | Example |
| --- | --- | --- |
| $Author$ | Helix user submitting the file | $Author: bruno $ |
| $Change$ | Helix changelist number under which file was submitted | $Change: 439 $ |
| $Date$ | Date of last submission in format YYYY/MM/DD | $Date: 2011/08/18 $ |
| $DateTime$ | Date and time of last submission in format YYYY/MM/DD hh:mm:ss. Date and time are as of the local time on the Helix server at time of submission. | $DateTime: 2011/08/18 23:17:02 $ |
| $File$ | Filename only, in depot syntax (without revision number) | $File: //depot/path/file.txt $ |
| $Header$ | Synonymous with $Id$ | $Header: //depot/path/file.txt#3 $ |
| $Id$ | Filename and revision number in depot syntax | $Id: //depot/path/file.txt#3 $ |
| $Revision$ | Helix revision number | $Revision: #3 $ |

To display a file without expanding its keywords, use p4 print -k filename.
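The expansion rules in the table above can be mimicked with a small Python sketch. It is an illustration of the text substitution only, not Helix's implementation: the function name and the `values` dictionary are hypothetical, and the UTC and time zone variants of the date keywords are not handled.

```python
import re

# Keywords from the table above; matching is case-sensitive, and an
# existing ": value " after the keyword is replaced on re-expansion.
KEYWORD_RE = re.compile(
    r"\$(Author|Change|DateTime|Date|File|Header|Id|Revision)(?::[^$\n]*)?\$"
)


def expand_keywords(text: str, values: dict) -> str:
    """Replace $Keyword$ or $Keyword: old $ with '$Keyword: value $'."""
    def replace(match):
        keyword = match.group(1)
        if keyword not in values:
            return match.group(0)  # no value supplied: leave as-is
        return f"${keyword}: {values[keyword]} $"

    return KEYWORD_RE.sub(replace, text)


# Hypothetical metadata matching the examples in the table.
values = {
    "Author": "bruno",
    "Change": "439",
    "Date": "2011/08/18",
    "DateTime": "2011/08/18 23:17:02",
    "File": "//depot/path/file.txt",
    "Header": "//depot/path/file.txt#3",
    "Id": "//depot/path/file.txt#3",
    "Revision": "#3",
}
```

Calling `expand_keywords("$Id$", values)` yields `$Id: //depot/path/file.txt#3 $`, while a lowercase `$id$` is left unchanged because the keywords are case-sensitive.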