Cascadia Programming

Documentation

Technical Formatting Syntax

NOTICE

The following formatting codes and examples are for reference only and are normally not needed unless some very unusual data has to be formatted. Normally everything can be done from the Cartographer user interface using the standard set of functions and formatting options, without specifying a custom format. This is an in-progress document.

Number/Currency Formatting

Symbol Location Localized? Meaning
0 Number Yes Digit
# Number Yes Digit, zero shows as absent
. Number Yes Decimal separator or monetary decimal separator
- Number Yes Minus sign
, Number Yes Grouping separator
E Number Yes Separates mantissa and exponent in scientific notation. Need not be quoted in prefix or suffix.
; Subpattern boundary Yes Separates positive and negative subpatterns
% Prefix or suffix Yes Multiply by 100 and show as percentage
\u2030 Prefix or suffix Yes Multiply by 1000 and show as per mille value
¤ (\u00A4) Prefix or suffix No Currency sign, replaced by currency symbol. If doubled, replaced by international currency symbol. If present in a pattern, the monetary decimal separator is used instead of the decimal separator.
' Prefix or suffix No Used to quote special characters in a prefix or suffix, for example, "'#'#" formats 123 to "#123". To create a single quote itself, use two in a row: "# o''clock".

Examples

Format "0.00######": each # marks a position where a digit is displayed unless it is a trailing zero. ... The number of decimal places in the formatted string will not exceed the total number of 0s and #s after the dot, so in this example any digits after the 8th decimal place are rounded off.
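As a minimal sketch of how these symbols behave, the example below assumes the patterns are evaluated by Java's java.text.DecimalFormat (the class name NumberSymbolsSketch and the helper format are made up for illustration). U.S. symbols are forced so the output does not depend on the machine's locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class NumberSymbolsSketch {
    // Hypothetical helper: applies a pattern with U.S. separators
    static String format(String pattern, double value) {
        DecimalFormatSymbols us = DecimalFormatSymbols.getInstance(Locale.US);
        return new DecimalFormat(pattern, us).format(value);
    }

    public static void main(String[] args) {
        // Two mandatory decimals (0), up to six optional ones (#)
        System.out.println(format("0.00######", 1.5));          // 1.50
        // The 9th decimal is rounded away, not simply cut off
        System.out.println(format("0.00######", 1.123456789));  // 1.12345679
        // % multiplies by 100 before formatting
        System.out.println(format("#%", 0.25));                 // 25%
        // A quoted # is a literal character in the prefix
        System.out.println(format("'#'#", 123));                // #123
    }
}
```

DecimalFormat rounds half-even by default, which is why the last example digit becomes 9 rather than staying 8.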

Number Format Pattern Syntax

You can design your own format patterns for numbers by following the rules specified by the following BNF diagram:

pattern    := subpattern{;subpattern}
subpattern := {prefix}integer{.fraction}{suffix}
prefix     := '\\u0000'..'\\uFFFD' - specialCharacters
suffix     := '\\u0000'..'\\uFFFD' - specialCharacters
integer    := '#'* '0'* '0'
fraction   := '0'* '#'*

The notation used in the preceding diagram is explained in the following table:

Notation Description
X* 0 or more instances of X
(X | Y) either X or Y
X..Y any character from X up to Y, inclusive
S - T characters in S, except those in T
{X} X is optional

In the preceding BNF diagram, the first subpattern specifies the format for positive numbers. The second subpattern, which is optional, specifies the format for negative numbers.

Although not noted in the BNF diagram, a comma may appear within the integer portion.
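The positive and negative subpatterns, and the comma in the integer portion, can be sketched as follows. This assumes java.text.DecimalFormat as the formatter; SubpatternSketch and format are illustrative names only.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class SubpatternSketch {
    // Hypothetical helper: applies a pattern with U.S. separators
    static String format(String pattern, double value) {
        DecimalFormatSymbols us = DecimalFormatSymbols.getInstance(Locale.US);
        return new DecimalFormat(pattern, us).format(value);
    }

    public static void main(String[] args) {
        // First subpattern for positives, second (after ';') for negatives
        String accounting = "#,##0.00;(#,##0.00)";
        System.out.println(format(accounting, 1234.5));    // 1,234.50
        System.out.println(format(accounting, -1234.5));   // (1,234.50)
        // Without a second subpattern, the default minus sign is used
        System.out.println(format("#,##0.00", -1234.5));   // -1,234.50
    }
}
```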

Within the subpatterns, you specify formatting with special symbols. These symbols are described in the following table:

Symbol Description
0 a digit
# a digit, zero shows as absent
. placeholder for decimal separator
, placeholder for grouping separator
E separates mantissa and exponent for exponential formats
; separates formats
- default negative prefix
% multiply by 100 and show as percentage
‰ (\u2030) multiply by 1000 and show as per mille
¤ currency sign; replaced by currency symbol; if doubled, replaced by international currency symbol; if present in a pattern, the monetary decimal separator is used instead of the decimal separator
X any other characters can be used in the prefix or suffix
' used to quote special characters in a prefix or suffix

The following table describes sample output. The value is the number, a double, that is to be formatted. The pattern is the String that specifies the formatting properties. The output, which is a String, represents the formatted number.

Output from DecimalFormatDemo Program
value pattern output Explanation
123456.789 ###,###.### 123,456.789 The pound sign (#) denotes a digit, the comma is a placeholder for the grouping separator, and the period is a placeholder for the decimal separator.
123456.789 ###.## 123456.79 The value has three digits to the right of the decimal point, but the pattern has only two. The format method handles this by rounding up.
123.78 000000.000 000123.780 The pattern specifies leading and trailing zeros, because the 0 character is used instead of the pound sign (#).
12345.67 $###,###.### $12,345.67 The first character in the pattern is the dollar sign ($). Note that it immediately precedes the leftmost digit in the formatted output.
12345.67 \u00A5###,###.### ¥12,345.67 The pattern specifies the currency sign for Japanese yen (¥) with the Unicode value 00A5.
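The rows of the table above can be reproduced with a short program. This is a sketch assuming java.text.DecimalFormat with U.S. symbols; the class name PatternTableSketch and its helper are hypothetical and are not the DecimalFormatDemo program itself.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class PatternTableSketch {
    // Hypothetical helper: applies a pattern with U.S. separators
    static String format(String pattern, double value) {
        DecimalFormatSymbols us = DecimalFormatSymbols.getInstance(Locale.US);
        return new DecimalFormat(pattern, us).format(value);
    }

    public static void main(String[] args) {
        System.out.println(format("###,###.###", 123456.789));      // 123,456.789
        System.out.println(format("###.##", 123456.789));           // 123456.79
        System.out.println(format("000000.000", 123.78));           // 000123.780
        System.out.println(format("$###,###.###", 12345.67));       // $12,345.67
        // \u00A5 is the yen sign, written as a Unicode escape
        System.out.println(format("\u00A5###,###.###", 12345.67));  // ¥12,345.67
    }
}
```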

Examples using STRING field type

// Assumed setup (values inferred from the outputs shown below).
// Requires java.util.Calendar, java.util.GregorianCalendar and java.util.Locale.
long n = 461012;
double pi = Math.PI;
Calendar c = new GregorianCalendar(2006, Calendar.MAY, 29, 2, 34);

System.out.format("%d%n", n);      // -->  "461012"
System.out.format("%08d%n", n);    // -->  "00461012"
System.out.format("%+8d%n", n);    // -->  " +461012"
System.out.format("%,8d%n", n);    // -->  " 461,012"
System.out.format("%+,8d%n%n", n); // -->  "+461,012"

System.out.format("%f%n", pi);       // -->  "3.141593"
System.out.format("%.3f%n", pi);     // -->  "3.142"
System.out.format("%10.3f%n", pi);   // -->  "     3.142"
System.out.format("%-10.3f%n", pi);  // -->  "3.142     " (left-justified, padded to 10)
System.out.format(Locale.FRANCE, "%-10.4f%n%n", pi); // -->  "3,1416    "

System.out.format("%tB %te, %tY%n", c, c, c); // -->  "May 29, 2006"
System.out.format("%tl:%tM %tp%n", c, c, c);  // -->  "2:34 am"
System.out.format("%tD%n", c);    // -->  "05/29/06"

Date/Time Formatting

Letter Date or Time Component Presentation Examples
G Era designator Text AD
y Year Year 1996; 96
Y Week year Year 2009; 09
M Month in year (context sensitive) Month July; Jul; 07
L Month in year (standalone form) Month July; Jul; 07
w Week in year Number 27
W Week in month Number 2
D Day in year Number 189
d Day in month Number 10
F Day of week in month Number 2
E Day name in week Text Tuesday; Tue
u Day number of week (1 = Monday, ..., 7 = Sunday) Number 1
a Am/pm marker Text PM
H Hour in day (0-23) Number 0
k Hour in day (1-24) Number 24
K Hour in am/pm (0-11) Number 0
h Hour in am/pm (1-12) Number 12
m Minute in hour Number 30
s Second in minute Number 55
S Millisecond Number 978
z Time zone General time zone Pacific Standard Time; PST; GMT-08:00
Z Time zone RFC 822 time zone -0800
X Time zone ISO 8601 time zone -08; -0800; -08:00

Examples

The following examples show how date and time patterns are interpreted in the U.S. locale. The given date and time are 2001-07-04 12:08:56 local time in the U.S. Pacific Time time zone.

Date and Time Pattern Result
"yyyy.MM.dd G 'at' HH:mm:ss z" 2001.07.04 AD at 12:08:56 PDT
"EEE, MMM d, ''yy" Wed, Jul 4, '01
"h:mm a" 12:08 PM
"hh 'o''clock' a, zzzz" 12 o'clock PM, Pacific Daylight Time
"K:mm a, z" 0:08 PM, PDT
"yyyyy.MMMMM.dd GGG hh:mm aaa" 02001.July.04 AD 12:08 PM
"EEE, d MMM yyyy HH:mm:ss Z" Wed, 4 Jul 2001 12:08:56 -0700
"yyMMddHHmmssZ" 010704120856-0700
"yyyy-MM-dd'T'HH:mm:ss.SSSZ" 2001-07-04T12:08:56.235-0700
"yyyy-MM-dd'T'HH:mm:ss.SSSXXX" 2001-07-04T12:08:56.235-07:00
"YYYY-'W'ww-u" 2001-W27-3

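A few rows of the table above can be checked with the sketch below. It assumes the patterns are interpreted by java.text.SimpleDateFormat; the class DatePatternSketch and its helpers are illustrative names. The time zone and locale are pinned so the result does not depend on the machine, and the sample instant carries zero milliseconds, so the fractional part prints as .000 rather than the .235 shown in the table.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class DatePatternSketch {
    static final TimeZone PACIFIC = TimeZone.getTimeZone("America/Los_Angeles");

    // 2001-07-04 12:08:56.000 in U.S. Pacific Time
    static Date sampleInstant() {
        Calendar cal = new GregorianCalendar(PACIFIC, Locale.US);
        cal.clear();  // zero out every field, including milliseconds
        cal.set(2001, Calendar.JULY, 4, 12, 8, 56);
        return cal.getTime();
    }

    // Hypothetical helper: formats the sample instant with the given pattern
    static String format(String pattern) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.US);
        sdf.setTimeZone(PACIFIC);
        return sdf.format(sampleInstant());
    }

    public static void main(String[] args) {
        System.out.println(format("EEE, MMM d, ''yy"));              // Wed, Jul 4, '01
        System.out.println(format("K:mm a"));                        // 0:08 PM
        System.out.println(format("yyMMddHHmmssZ"));                 // 010704120856-0700
        System.out.println(format("yyyy-MM-dd'T'HH:mm:ss.SSSXXX"));  // 2001-07-04T12:08:56.000-07:00
    }
}
```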
Format String Syntax

  • The format specifiers for general, character, and numeric types have the following syntax:
       %[argument_index$][flags][width][.precision]conversion
     

    The optional argument_index is a decimal integer indicating the position of the argument in the argument list. The first argument is referenced by "1$", the second by "2$", etc.

    The optional flags is a set of characters that modify the output format. The set of valid flags depends on the conversion.

    The optional width is a positive decimal integer indicating the minimum number of characters to be written to the output.

    The optional precision is a non-negative decimal integer usually used to restrict the number of characters. The specific behavior depends on the conversion.

    The required conversion is a character indicating how the argument should be formatted. The set of valid conversions for a given argument depends on the argument's data type.

  • The format specifiers for types which are used to represent dates and times have the following syntax:
       %[argument_index$][flags][width]conversion
     

    The optional argument_index, flags and width are defined as above.

    The required conversion is a two character sequence. The first character is 't' or 'T'. The second character indicates the format to be used. These characters are similar to but not completely identical to those defined by GNU date and POSIX strftime(3c).

  • The format specifiers which do not correspond to arguments have the following syntax:
       %[flags][width]conversion
     

    The optional flags and width are defined as above.

    The required conversion is a character indicating content to be inserted in the output.
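The three specifier forms above can be illustrated with one call each. This is a sketch using String.format from the Java standard library; the class name SpecifierSyntaxSketch and its methods are made up for the example.

```java
import java.time.LocalDate;
import java.util.Locale;

class SpecifierSyntaxSketch {
    // General/numeric form: %[argument_index$][flags][width][.precision]conversion
    static String numeric() {
        // 1$ = first argument, "+," = flags, 15 = width, .2 = precision, f = conversion
        return String.format(Locale.US, "%1$+,15.2f", 1234567.891);
    }

    // Date/time form: %[argument_index$][flags][width]conversion, conversion = 't' + suffix
    static String year() {
        // '-' left-justifies the four-digit year in a field of width 6
        return String.format(Locale.US, "[%1$-6tY]", LocalDate.of(2006, 5, 29));
    }

    // Form without an argument: %% inserts a literal percent sign
    static String percent() {
        return String.format(Locale.US, "%d%%", 50);
    }

    public static void main(String[] args) {
        System.out.println(numeric());  // "  +1,234,567.89"
        System.out.println(year());     // "[2006  ]"
        System.out.println(percent());  // "50%"
    }
}
```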

    Conversions

    Conversions are divided into the following categories:

    1. General - may be applied to any argument type
    2. Character - may be applied to basic types which represent Unicode characters: char, Character, byte, Byte, short, and Short.
    3. Numeric
      1. Integral - may be applied to Java integral types: byte, Byte, short, Short, int and Integer, long, Long, and BigInteger (but not char or Character)
      2. Floating Point - may be applied to Java floating-point types: float, Float, double, Double, and BigDecimal
    4. Date/Time - may be applied to Java types which are capable of encoding a date or time: long, Long, Calendar, Date and TemporalAccessor
    5. Percent - produces a literal '%' ('\u0025')
    6. Line Separator - produces the platform-specific line separator

    The following table summarizes the supported conversions. Conversions denoted by an upper-case character (i.e. 'B', 'H', 'S', 'C', 'X', 'E', 'G', 'A', and 'T') are the same as those for the corresponding lower-case conversion characters except that the result is converted to upper case according to the rules of the prevailing Locale. The result is equivalent to the following invocation of String.toUpperCase():

        out.toUpperCase() 
    Conversion Argument Category Description
    'b', 'B' general If the argument arg is null, then the result is "false". If arg is a boolean or Boolean, then the result is the string returned by String.valueOf(arg). Otherwise, the result is "true".
    'h', 'H' general If the argument arg is null, then the result is "null". Otherwise, the result is obtained by invoking Integer.toHexString(arg.hashCode()).
    's', 'S' general If the argument arg is null, then the result is "null". If arg implements Formattable, then arg.formatTo is invoked. Otherwise, the result is obtained by invoking arg.toString().
    'c', 'C' character The result is a Unicode character
    'd' integral The result is formatted as a decimal integer
    'o' integral The result is formatted as an octal integer
    'x', 'X' integral The result is formatted as a hexadecimal integer
    'e', 'E' floating point The result is formatted as a decimal number in computerized scientific notation
    'f' floating point The result is formatted as a decimal number
    'g', 'G' floating point The result is formatted using computerized scientific notation or decimal format, depending on the precision and the value after rounding.
    'a', 'A' floating point The result is formatted as a hexadecimal floating-point number with a significand and an exponent. This conversion is not supported for the BigDecimal type despite the latter's being in the floating point argument category.
    't', 'T' date/time Prefix for date and time conversion characters. See Date/Time Conversions.
    '%' percent The result is a literal '%' ('\u0025')
    'n' line separator The result is the platform-specific line separator

    Any characters not explicitly defined as conversions are illegal and are reserved for future extensions.
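A few of the conversions from the table can be tried out with the following sketch. It uses String.format from the Java standard library; ConversionSketch and fmt are illustrative names only.

```java
import java.util.Locale;

class ConversionSketch {
    // Hypothetical helper: formats one argument with a fixed locale
    static String fmt(String spec, Object arg) {
        return String.format(Locale.US, spec, arg);
    }

    public static void main(String[] args) {
        System.out.println(fmt("%b", null));       // false (null argument)
        System.out.println(fmt("%b", "text"));     // true  (non-null, non-Boolean)
        System.out.println(fmt("%s", null));       // null
        System.out.println(fmt("%S", "abc"));      // ABC   (upper-case variant)
        System.out.println(fmt("%h", "hi"));       // hex of "hi".hashCode()
        System.out.println(fmt("%c", 'a'));        // a
        System.out.println(fmt("%d", 255));        // 255
        System.out.println(fmt("%o", 8));          // 10
        System.out.println(fmt("%x", 255));        // ff
        System.out.println(fmt("%X", 255));        // FF
        System.out.println(fmt("%e", 12345.678));  // 1.234568e+04
    }
}
```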

    Date/Time Conversions

    The following date and time conversion suffix characters are defined for the 't' and 'T' conversions. The types are similar to but not completely identical to those defined by GNU date and POSIX strftime(3c). Additional conversion types are provided to access Java-specific functionality (e.g. 'L' for milliseconds within the second).

    The following conversion characters are used for formatting times:

    'H' Hour of the day for the 24-hour clock, formatted as two digits with a leading zero as necessary i.e. 00 - 23.
    'I' Hour for the 12-hour clock, formatted as two digits with a leading zero as necessary, i.e. 01 - 12.
    'k' Hour of the day for the 24-hour clock, i.e. 0 - 23.
    'l' Hour for the 12-hour clock, i.e. 1 - 12.
    'M' Minute within the hour formatted as two digits with a leading zero as necessary, i.e. 00 - 59.
    'S' Seconds within the minute, formatted as two digits with a leading zero as necessary, i.e. 00 - 60 ("60" is a special value required to support leap seconds).
    'L' Millisecond within the second formatted as three digits with leading zeros as necessary, i.e. 000 - 999.
    'N' Nanosecond within the second, formatted as nine digits with leading zeros as necessary, i.e. 000000000 - 999999999.
    'p' Locale-specific morning or afternoon marker in lower case, e.g. "am" or "pm". Use of the conversion prefix 'T' forces this output to upper case.
    'z' RFC 822 style numeric time zone offset from GMT, e.g. -0800. This value will be adjusted as necessary for Daylight Saving Time. For long, Long, and Date the time zone used is the default time zone for this instance of the Java virtual machine.
    'Z' A string representing the abbreviation for the time zone. This value will be adjusted as necessary for Daylight Saving Time. For long, Long, and Date the time zone used is the default time zone for this instance of the Java virtual machine. The Formatter's locale will supersede the locale of the argument (if any).
    's' Seconds since the beginning of the epoch starting at 1 January 1970 00:00:00 UTC, i.e. Long.MIN_VALUE/1000 to Long.MAX_VALUE/1000.
    'Q' Milliseconds since the beginning of the epoch starting at 1 January 1970 00:00:00 UTC, i.e. Long.MIN_VALUE to Long.MAX_VALUE.

    The following conversion characters are used for formatting dates:

    'B' Locale-specific full month name, e.g. "January", "February".
    'b' Locale-specific abbreviated month name, e.g. "Jan", "Feb".
    'h' Same as 'b'.
    'A' Locale-specific full name of the day of the week, e.g. "Sunday", "Monday"
    'a' Locale-specific short name of the day of the week, e.g. "Sun", "Mon"
    'C' Four-digit year divided by 100, formatted as two digits with leading zero as necessary, i.e. 00 - 99
    'Y' Year, formatted as at least four digits with leading zeros as necessary, e.g. 0092 equals 92 CE for the Gregorian calendar.
    'y' Last two digits of the year, formatted with leading zeros as necessary, i.e. 00 - 99.
    'j' Day of year, formatted as three digits with leading zeros as necessary, e.g. 001 - 366 for the Gregorian calendar.
    'm' Month, formatted as two digits with leading zeros as necessary, i.e. 01 - 13.
    'd' Day of month, formatted as two digits with leading zeros as necessary, i.e. 01 - 31
    'e' Day of month, formatted as one or two digits with no leading zero, i.e. 1 - 31.

    The following conversion characters are used for formatting common date/time compositions.

    'R' Time formatted for the 24-hour clock as "%tH:%tM"
    'T' Time formatted for the 24-hour clock as "%tH:%tM:%tS".
    'r' Time formatted for the 12-hour clock as "%tI:%tM:%tS %Tp". The location of the morning or afternoon marker ('%Tp') may be locale-dependent.
    'D' Date formatted as "%tm/%td/%ty".
    'F' ISO 8601 complete date formatted as "%tY-%tm-%td".
    'c' Date and time formatted as "%ta %tb %td %tT %tZ %tY", e.g. "Sun Jul 20 16:17:00 EDT 1969".

    Any characters not explicitly defined as date/time conversion suffixes are illegal and are reserved for future extensions.
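The time, date, and composite suffixes can be compared side by side with the sketch below. It formats java.time values (a TemporalAccessor) through String.format; DateTimeConversionSketch and fmt are made-up names, and the sample moment is the one used in the 'c' example above, without a time zone.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.Locale;

class DateTimeConversionSketch {
    // 20 July 1969, 16:17:00 (no zone attached)
    static final LocalDateTime MOMENT = LocalDateTime.of(1969, 7, 20, 16, 17, 0);

    // Hypothetical helper: formats one temporal value with a fixed locale
    static String fmt(String spec, Object temporal) {
        return String.format(Locale.US, spec, temporal);
    }

    public static void main(String[] args) {
        System.out.println(fmt("%tF", MOMENT));              // 1969-07-20
        System.out.println(fmt("%tT", MOMENT));              // 16:17:00
        System.out.println(fmt("%tR", MOMENT));              // 16:17
        System.out.println(fmt("%tD", MOMENT));              // 07/20/69
        System.out.println(fmt("%tA", MOMENT));              // Sunday
        System.out.println(fmt("%tj", MOMENT));              // 201 (day of year)
        System.out.println(fmt("%tB %<te, %<tY", MOMENT));   // July 20, 1969
        System.out.println(fmt("%tI:%<tM %<Tp", MOMENT));    // 04:17 PM
        // 'd' pads the day of month with a zero, 'e' does not
        LocalDate fourth = LocalDate.of(2001, 7, 4);
        System.out.println(fmt("%td", fourth));              // 04
        System.out.println(fmt("%te", fourth));              // 4
    }
}
```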

    Flags

    The following table summarizes the supported flags. y means the flag is supported for the indicated argument types.

    Flag  General  Character  Integral  Floating Point  Date/Time  Description
    '-'   y        y          y         y               y          The result will be left-justified.
    '#'   y(1)     -          y(3)      y               -          The result should use a conversion-dependent alternate form
    '+'   -        -          y(4)      y               -          The result will always include a sign
    ' '   -        -          y(4)      y               -          The result will include a leading space for positive values
    '0'   -        -          y         y               -          The result will be zero-padded
    ','   -        -          y(2)      y(5)            -          The result will include locale-specific grouping separators
    '('   -        -          y(4)      y(5)            -          The result will enclose negative numbers in parentheses

    (1) Depends on the definition of Formattable.

    (2) For 'd' conversion only.

    (3) For 'o', 'x', and 'X' conversions only.

    (4) For 'd', 'o', 'x', and 'X' conversions applied to BigInteger or 'd' applied to byte, Byte, short, Short, int and Integer, long, and Long.

    (5) For 'e', 'E', 'f', 'g', and 'G' conversions only.

    Any characters not explicitly defined as flags are illegal and are reserved for future extensions.
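Each flag can be seen in isolation in the following sketch, which uses String.format from the Java standard library. FlagSketch and fmt are illustrative names, and the locale is fixed to the U.S. so the grouping separator is a comma.

```java
import java.util.Locale;

class FlagSketch {
    // Hypothetical helper: formats one argument with a fixed locale
    static String fmt(String spec, Object arg) {
        return String.format(Locale.US, spec, arg);
    }

    public static void main(String[] args) {
        System.out.println(fmt("[%-8d]", 42));       // [42      ]  left-justified
        System.out.println(fmt("%+d", 42));          // +42         always signed
        System.out.println(fmt("% d", 42));          // " 42"       leading space
        System.out.println(fmt("%08d", 42));         // 00000042    zero-padded
        System.out.println(fmt("%,d", 1234567));     // 1,234,567   grouping
        System.out.println(fmt("%(,.2f", -1234.5));  // (1,234.50)  parentheses
        System.out.println(fmt("%#x", 255));         // 0xff        alternate form
        System.out.println(fmt("%#o", 8));           // 010         alternate form
    }
}
```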

    Width

    The width is the minimum number of characters to be written to the output. For the line separator conversion, width is not applicable; if it is provided, an exception will be thrown.

    Precision

    For general argument types, the precision is the maximum number of characters to be written to the output.

    For the floating-point conversions 'a', 'A', 'e', 'E', and 'f' the precision is the number of digits after the radix point. If the conversion is 'g' or 'G', then the precision is the total number of digits in the resulting magnitude after rounding.

    For character, integral, and date/time argument types and the percent and line separator conversions, the precision is not applicable; if a precision is provided, an exception will be thrown.
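The width and precision rules above can be sketched as follows, again with String.format. WidthPrecisionSketch and its methods are made-up names for this illustration.

```java
import java.util.IllegalFormatPrecisionException;
import java.util.Locale;

class WidthPrecisionSketch {
    // Hypothetical helper: formats one argument with a fixed locale
    static String fmt(String spec, Object arg) {
        return String.format(Locale.US, spec, arg);
    }

    // Returns true if a precision on an integral conversion is rejected
    static boolean precisionRejectedForIntegers() {
        try {
            String.format(Locale.US, "%.2d", 5);
            return false;
        } catch (IllegalFormatPrecisionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(fmt("[%10s]", "abc"));           // [       abc]  minimum width
        System.out.println(fmt("[%.3s]", "Cartographer"));  // [Car]         maximum characters
        System.out.println(fmt("%.2f", Math.PI));           // 3.14          digits after the point
        System.out.println(fmt("%.3e", 12345.678));         // 1.235e+04
        System.out.println(fmt("%.5g", 1234.5678));         // 1234.6        total significant digits
        System.out.println(precisionRejectedForIntegers()); // true
    }
}
```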

    Argument Index

    The argument index is a decimal integer indicating the position of the argument in the argument list. The first argument is referenced by "1$", the second by "2$", etc.

    Another way to reference arguments by position is to use the '<' ('\u003c') flag, which causes the argument for the previous format specifier to be re-used.
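Explicit indexes and the '<' flag might be used as in the sketch below. It relies on String.format from the Java standard library; ArgumentIndexSketch and its method names are hypothetical.

```java
import java.time.LocalDate;
import java.util.Locale;

class ArgumentIndexSketch {
    // Explicit indexes let arguments appear in any order, and more than once
    static String reordered() {
        return String.format(Locale.US, "%2$s %1$s %2$s", "a", "b");
    }

    // '<' re-uses the argument of the previous specifier;
    // the plain %s that follows moves on to the next unused argument
    static String relative() {
        return String.format(Locale.US, "%s %<s %s", "x", "y");
    }

    // Handy for dates: pass the value once, format several fields
    static String isoDate() {
        return String.format(Locale.US, "%1$tY-%<tm-%<td", LocalDate.of(2006, 5, 29));
    }

    public static void main(String[] args) {
        System.out.println(reordered());  // b a b
        System.out.println(relative());   // x x y
        System.out.println(isoDate());    // 2006-05-29
    }
}
```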


    Summary of regular-expression constructs

    Construct Matches
     
    Characters
    x The character x
    \\ The backslash character
    \0n The character with octal value 0n (0 <= n <= 7)
    \0nn The character with octal value 0nn (0 <= n <= 7)
    \0mnn The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
    \xhh The character with hexadecimal value 0xhh
    \uhhhh The character with hexadecimal value 0xhhhh
    \x{h...h} The character with hexadecimal value 0xh...h (Character.MIN_CODE_POINT  <= 0xh...h <=  Character.MAX_CODE_POINT)
    \t The tab character ('\u0009')
    \n The newline (line feed) character ('\u000A')
    \r The carriage-return character ('\u000D')
    \f The form-feed character ('\u000C')
    \a The alert (bell) character ('\u0007')
    \e The escape character ('\u001B')
    \cx The control character corresponding to x
     
    Character classes
    [abc] a, b, or c (simple class)
    [^abc] Any character except a, b, or c (negation)
    [a-zA-Z] a through z or A through Z, inclusive (range)
    [a-d[m-p]] a through d, or m through p: [a-dm-p] (union)
    [a-z&&[def]] d, e, or f (intersection)
    [a-z&&[^bc]] a through z, except for b and c: [ad-z] (subtraction)
    [a-z&&[^m-p]] a through z, and not m through p: [a-lq-z](subtraction)
     
    Predefined character classes
    . Any character (may or may not match line terminators)
    \d A digit: [0-9]
    \D A non-digit: [^0-9]
    \h A horizontal whitespace character: [ \t\xA0\u1680\u180e\u2000-\u200a\u202f\u205f\u3000]
    \H A non-horizontal whitespace character: [^\h]
    \s A whitespace character: [ \t\n\x0B\f\r]
    \S A non-whitespace character: [^\s]
    \v A vertical whitespace character: [\n\x0B\f\r\x85\u2028\u2029]
    \V A non-vertical whitespace character: [^\v]
    \w A word character: [a-zA-Z_0-9]
    \W A non-word character: [^\w]
     
    POSIX character classes (US-ASCII only)
    \p{Lower} A lower-case alphabetic character: [a-z]
    \p{Upper} An upper-case alphabetic character: [A-Z]
    \p{ASCII} All ASCII: [\x00-\x7F]
    \p{Alpha} An alphabetic character: [\p{Lower}\p{Upper}]
    \p{Digit} A decimal digit: [0-9]
    \p{Alnum} An alphanumeric character: [\p{Alpha}\p{Digit}]
    \p{Punct} Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
    \p{Graph} A visible character: [\p{Alnum}\p{Punct}]
    \p{Print} A printable character: [\p{Graph}\x20]
    \p{Blank} A space or a tab: [ \t]
    \p{Cntrl} A control character: [\x00-\x1F\x7F]
    \p{XDigit} A hexadecimal digit: [0-9a-fA-F]
    \p{Space} A whitespace character: [ \t\n\x0B\f\r]
     
    java.lang.Character classes (simple java character type)
    \p{javaLowerCase} Equivalent to java.lang.Character.isLowerCase()
    \p{javaUpperCase} Equivalent to java.lang.Character.isUpperCase()
    \p{javaWhitespace} Equivalent to java.lang.Character.isWhitespace()
    \p{javaMirrored} Equivalent to java.lang.Character.isMirrored()
     
    Classes for Unicode scripts, blocks, categories and binary properties
    \p{IsLatin} A Latin script character (script)
    \p{InGreek} A character in the Greek block (block)
    \p{Lu} An uppercase letter (category)
    \p{IsAlphabetic} An alphabetic character (binary property)
    \p{Sc} A currency symbol
    \P{InGreek} Any character except one in the Greek block (negation)
    [\p{L}&&[^\p{Lu}]] Any letter except an uppercase letter (subtraction)
     
    Boundary matchers
    ^ The beginning of a line
    $ The end of a line
    \b A word boundary
    \B A non-word boundary
    \A The beginning of the input
    \G The end of the previous match
    \Z The end of the input but for the final terminator, if any
    \z The end of the input
     
    Linebreak matcher
    \R Any Unicode linebreak sequence, is equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]
     
    Greedy quantifiers
    X? X, once or not at all
    X* X, zero or more times
    X+ X, one or more times
    X{n} X, exactly n times
    X{n,} X, at least n times
    X{n,m} X, at least n but not more than m times
     
    Reluctant quantifiers
    X?? X, once or not at all
    X*? X, zero or more times
    X+? X, one or more times
    X{n}? X, exactly n times
    X{n,}? X, at least n times
    X{n,m}? X, at least n but not more than m times
     
    Possessive quantifiers
    X?+ X, once or not at all
    X*+ X, zero or more times
    X++ X, one or more times
    X{n}+ X, exactly n times
    X{n,}+ X, at least n times
    X{n,m}+ X, at least n but not more than m times
     
    Logical operators
    XY X followed by Y
    X|Y Either X or Y
    (X) X, as a capturing group
     
    Back references
    \n Whatever the nth capturing group matched
    \k<name> Whatever the named-capturing group "name" matched
     
    Quotation
    \ Nothing, but quotes the following character
    \Q Nothing, but quotes all characters until \E
    \E Nothing, but ends quoting started by \Q
     
    Special constructs (named-capturing and non-capturing)
    (?<name>X) X, as a named-capturing group
    (?:X) X, as a non-capturing group
    (?idmsuxU-idmsuxU) Nothing, but turns match flags i d m s u x U on - off
    (?idmsux-idmsux:X) X, as a non-capturing group with the given flags i d m s u x on - off
    (?=X) X, via zero-width positive lookahead
    (?!X) X, via zero-width negative lookahead
    (?<=X) X, via zero-width positive lookbehind
    (?<!X) X, via zero-width negative lookbehind
    (?>X) X, as an independent, non-capturing group
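Several of the constructs summarized above can be tried out with the sketch below. It assumes the expressions are evaluated by java.util.regex.Pattern; RegexConstructsSketch and its helpers are illustrative names. Note that backslashes are doubled inside Java string literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexConstructsSketch {
    // Hypothetical helper: first match of regex in input, or null if none
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Hypothetical helper: content of a named group in the first match, or null
    static String namedGroup(String regex, String input, String name) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(name) : null;
    }

    public static void main(String[] args) {
        // Greedy takes as much as possible, reluctant as little as possible
        System.out.println(firstMatch("<.+>", "<a><b>"));    // <a><b>
        System.out.println(firstMatch("<.+?>", "<a><b>"));   // <a>
        // Possessive never gives characters back, so the closing '>' cannot match
        System.out.println(firstMatch("<.++>", "<a><b>"));   // null
        // Character class subtraction: lower-case letters except vowels
        System.out.println(firstMatch("[a-z&&[^aeiou]]+", "audio stream"));  // d
        // Back reference \1 repeats what group 1 matched
        System.out.println(firstMatch("\\b(\\w+) \\1\\b", "it was the the best"));  // the the
        // Named-capturing group
        System.out.println(namedGroup("(?<year>\\d{4})-(?<month>\\d{2})", "on 2006-05-29", "month"));  // 05
        // Zero-width positive lookahead: digits followed by " USD"
        System.out.println(firstMatch("\\d+(?= USD)", "10 EUR, 25 USD"));  // 25
        // Inline flag (?i) turns on case-insensitive matching
        System.out.println(firstMatch("(?i)cartographer", "CARTOGRAPHER"));  // CARTOGRAPHER
    }
}
```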