Character Set Support
Character set support is only enabled for Unicode-enabled Helix Servers. In this mode, P4Java differentiates between Helix Server file content character sets (that is, the encoding used to read or write a file’s contents) and the character sets used for Helix Server file names, job specs, changelist descriptions, and so on.
This distinction is made due to the way Java handles strings and basic
I/O: in general, while file content character set encodings need to be
preserved so that the end results written to or read from the local disk
are properly encoded, P4Java does not need to know about file metadata or
other string value encodings. Because Helix Servers store and transmit all
such metadata and strings in normalized UTF-8 form, and because all Java
strings are inherently encoded in UTF-16, encoding to and from non-UTF-16
character sets (such as shiftjis) is done outside P4Java (usually by the
surrounding application) and is neither influenced by nor implemented in
P4Java itself. This means that the character set passed to the
IOptionsServer.setCharsetName method is used only for translation of file
content. Everything else, including all file names, job specs, changelist
descriptions, and so on, is held in Java's native string encoding, UTF-16,
and may or may not need to be translated out of that encoding to something
like shiftjis or winansi.
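As a rough illustration of the translation that is left to the surrounding application, the sketch below converts a Java string (for example, a changelist description already obtained from P4Java) into bytes in a target character set and back again. It uses only the standard JDK; the class name StringBoundarySketch and its helper methods are hypothetical and are not part of P4Java.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of P4Java: shows how an application might
// translate metadata strings to and from a non-UTF-16 character set.
final class StringBoundarySketch {

    // Encode a Java (UTF-16) string into bytes of the named Java charset.
    static byte[] encode(String text, String javaCharsetName) {
        return text.getBytes(Charset.forName(javaCharsetName));
    }

    // Decode bytes in the named Java charset back into a Java string.
    static String decode(byte[] bytes, String javaCharsetName) {
        return new String(bytes, Charset.forName(javaCharsetName));
    }

    public static void main(String[] args) {
        // Stands in for a description returned by P4Java as a Java string.
        String description = "Fix caf\u00e9 menu";

        byte[] asUtf8 = encode(description, "UTF-8");
        byte[] asLatin1 = encode(description, "ISO-8859-1");

        // The same string occupies a different number of bytes per charset.
        System.out.println("UTF-8 bytes: " + asUtf8.length);
        System.out.println("ISO-8859-1 bytes: " + asLatin1.length);

        // Decoding with the matching charset restores the original string.
        System.out.println(decode(asLatin1, "ISO-8859-1").equals(description));
    }
}
```

The charset names here are Java names. An application that targets something like shiftjis would pass the corresponding Java charset name instead, provided the local JDK supports it.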
P4Java supports file content operations on files encoded in most, but not
all, of the character sets supported by the Helix Server. The list of
supported Helix Server file content charsets is available to calling
programs through the PerforceCharsets.getKnownCharsets method. If you
attempt to set an IOptionsServer object’s charset to a charset that is not
supported by both the Helix Server and the local JDK installation, you will
get an exception; similarly, if you try to (for example) sync a file with an
unsupported character set encoding, you will also get an exception.
The Helix Server uses non-standard names for several standard character sets. P4Java uses the Helix Server name for each such character set rather than the standard name.
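To make the naming difference concrete, the following sketch resolves a Helix Server style charset name to a JDK Charset and fails if either the name is unknown or the local JDK lacks the charset. The class HelixCharsetLookupSketch and its four-entry table are assumptions made for illustration only; they are not P4Java's actual mapping, and the authoritative list of names remains whatever PerforceCharsets.getKnownCharsets returns.

```java
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Map;

// Hypothetical lookup, not part of P4Java.
final class HelixCharsetLookupSketch {

    // Illustrative subset: Helix Server style name -> assumed Java name.
    static final Map<String, String> HELIX_TO_JAVA = Map.of(
            "utf8", "UTF-8",
            "iso8859-1", "ISO-8859-1",
            "shiftjis", "Shift_JIS",
            "winansi", "windows-1252");

    // Resolve a Helix-style name, rejecting names that are missing from
    // the table or whose Java charset the local JDK does not provide.
    static Charset resolve(String helixName) {
        String javaName = HELIX_TO_JAVA.get(helixName);
        if (javaName == null) {
            throw new IllegalArgumentException(
                    "Unknown Helix Server charset name: " + helixName);
        }
        if (!Charset.isSupported(javaName)) {
            throw new UnsupportedCharsetException(javaName);
        }
        return Charset.forName(javaName);
    }

    public static void main(String[] args) {
        // The Helix-style name, not the standard one, is the lookup key.
        System.out.println(resolve("utf8").name());
    }
}
```

Note that passing the standard name (for example, UTF-8) to this lookup fails, which mirrors the point above: callers are expected to use the Helix Server name.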