| locale(5) | File Formats Manual | locale(5) |
NAME¶
locale - describes a locale definition file
DESCRIPTION¶
The locale definition file contains all the information that the localedef(1) command needs to convert it into the binary locale database.
The definition files consist of sections which each describe a locale category in detail. See locale(7) for additional details for these categories.
Syntax¶
The locale definition file starts with a header that may consist of the following keywords:
escape_char is followed by a character that should be used as the escape-character for the rest of the file to mark characters that should be interpreted in a special way. It defaults to the backslash (\). comment_char is followed by a character that will be used as the comment-character for the rest of the file. It defaults to the number sign (#).The locale definition has one part for each locale category. Each part can be copied from another existing locale or can be defined from scratch. If the category should be copied, the only valid keyword in the definition is copy followed by the name of the locale in double quotes which should be copied. The exceptions for this rule are LC_COLLATE and LC_CTYPE where a copy statement can be followed by locale-specific rules and selected overrides.
When defining a locale or a category from scratch, an existing system- provided locale definition file should be used as a reference to follow common glibc conventions.
Locale category sections¶
The following category sections are defined by POSIX:
• LC_CTYPE • LC_COLLATE • LC_MESSAGES • LC_MONETARY • LC_NUMERIC • LC_TIMEIn addition, since glibc 2.2, the GNU C library supports the following nonstandard categories:
• LC_ADDRESS • LC_IDENTIFICATION • LC_MEASUREMENT • LC_NAME • LC_PAPER • LC_TELEPHONESee locale(7) for a more detailed description of each category.
LC_ADDRESS¶
The definition starts with the string LC_ADDRESS in the first column.
The following keywords are allowed:
postal_fmt followed by a string containing field descriptors that define the format used for postal addresses in the locale. The following field descriptors are recognized:Each field descriptor may have an 'R' after the '%' to specify that the information is taken from a Romanized version string of the entity.
The LC_ADDRESS definition ends with the string END LC_ADDRESS.
LC_CTYPE¶
The definition starts with the string LC_CTYPE in the first column.
The following keywords are allowed:
upper followed by a list of uppercase letters. The letters A through Z are included automatically. Characters also specified as cntrl, digit, punct, or space are not allowed. lower followed by a list of lowercase letters. The letters a through z are included automatically. Characters also specified as cntrl, digit, punct, or space are not allowed. alpha followed by a list of letters. All character specified as either upper or lower are automatically included. Characters also specified as cntrl, digit, punct, or space are not allowed. digit followed by the characters classified as numeric digits. Only the digits 0 through 9 are allowed. They are included by default in this class. space followed by a list of characters defined as white-space characters. Characters also specified as upper, lower, alpha, digit, graph, or xdigit are not allowed. The characters <space>, <form-feed>, <newline>, <carriage-return>, <tab>, and <vertical-tab> are automatically included. cntrl followed by a list of control characters. Characters also specified as upper, lower, alpha, digit, punct, graph, print, or xdigit are not allowed. punct followed by a list of punctuation characters. Characters also specified as upper, lower, alpha, digit, cntrl, xdigit, or the <space> character are not allowed. graph followed by a list of printable characters, not including the <space> character. The characters defined as upper, lower, alpha, digit, xdigit, and punct are automatically included. Characters also specified as cntrl are not allowed. print followed by a list of printable characters, including the <space> character. The characters defined as upper, lower, alpha, digit, xdigit, punct, and the <space> character are automatically included. Characters also specified as cntrl are not allowed. xdigit followed by a list of characters classified as hexadecimal digits. The decimal digits must be included followed by one or more set of six characters in ascending order. The following characters are included by default: 0 through 9, a through f, A through F. blank followed by a list of characters classified as blank. The characters <space> and <tab> are automatically included. charclass followed by a list of locale-specific character class names which are then to be defined in the locale. toupper followed by a list of mappings from lowercase to uppercase letters. Each mapping is a pair of a lowercase and an uppercase letter separated with a , and enclosed in parentheses. tolower followed by a list of mappings from uppercase to lowercase letters. If the keyword tolower is not present, the reverse of the toupper list is used. map totitle followed by a list of mapping pairs of characters and letters to be used in titles (headings). class followed by a locale-specific character class definition, starting with the class name followed by the characters belonging to the class. charconv followed by a list of locale-specific character mapping names which are then to be defined in the locale. outdigit followed by a list of alternate output digits for the locale. map to_inpunct followed by a list of mapping pairs of alternate digits and separators for input digits for the locale. map to_outpunct followed by a list of mapping pairs of alternate separators for output for the locale. translit_start marks the start of the transliteration rules section. The section can contain the include keyword in the beginning followed by locale-specific rules and overrides. Any rule specified in the locale file will override any rule copied or included from other files. In case of duplicate rule definitions in the locale file, only the first rule is used. A transliteration rule consist of a character to be transliterated followed by a list of transliteration targets separated by semicolons. The first target which can be presented in the target character set is used, if none of them can be used the default_missing character will be used instead. include in the transliteration rules section includes a transliteration rule file (and optionally a repertoire map file). default_missing in the transliteration rules section defines the default character to be used for transliteration where none of the targets cannot be presented in the target character set. translit_end marks the end of the transliteration rules.The LC_CTYPE definition ends with the string END LC_CTYPE.
LC_COLLATE¶
Note that glibc does not support all POSIX-defined options, only the options described below are supported (as of glibc 2.23).
The definition starts with the string LC_COLLATE in the first column.
The following keywords are allowed:
coll_weight_max followed by the number representing used collation levels. This keyword is recognized but ignored by glibc. collating-element followed by the definition of a collating-element symbol representing a multicharacter collating element. collating-symbol followed by the definition of a collating symbol that can be used in collation order statements. define followed by string to be evaluated in an ifdef string / else / endif construct. reorder-after followed by a redefinition of a collation rule. reorder-end marks the end of the redefinition of a collation rule. reorder-sections-after followed by a script name to reorder listed scripts after. reorder-sections-end marks the end of the reordering of sections. script followed by a declaration of a script. symbol-equivalence followed by a collating-symbol to be equivalent to another defined collating-symbol.The collation rule definition starts with a line:
order_start followed by a list of keywords chosen from forward, backward, or position. The order definition consists of lines that describe the collation order and is terminated with the keyword order_end.The LC_COLLATE definition ends with the string END LC_COLLATE.
LC_IDENTIFICATION¶
The definition starts with the string LC_IDENTIFICATION in the first column.
The following keywords are allowed:
title followed by the title of the locale document (e.g., "Maori language locale for New Zealand"). source followed by the name of the organization that maintains this document. address followed by the address of the organization that maintains this document. contact followed by the name of the contact person at the organization that maintains this document. email followed by the email address of the person or organization that maintains this document. tel followed by the telephone number (in international format) of the organization that maintains this document. As of glibc 2.24, this keyword is deprecated in favor of other contact methods. fax followed by the fax number (in international format) of the organization that maintains this document. As of glibc 2.24, this keyword is deprecated in favor of other contact methods. language followed by the name of the language to which this document applies. territory followed by the name of the country/geographic extent to which this document applies. audience followed by a description of the audience for which this document is intended. application followed by a description of any special application for which this document is intended. abbreviation followed by the short name for provider of the source of this document. revision followed by the revision number of this document. date followed by the revision date of this document.In addition, for each of the categories defined by the document, there should be a line starting with the keyword category, followed by:
(1) a string that identifies this locale category definition, (2) a semicolon, and (3) one of the LC_* identifiers.The LC_IDENTIFICATION definition ends with the string END LC_IDENTIFICATION.
LC_MESSAGES¶
The definition starts with the string LC_MESSAGES in the first column.
The following keywords are allowed:
yesexpr followed by a regular expression that describes possible yes-responses. noexpr followed by a regular expression that describes possible no-responses. yesstr followed by the output string corresponding to "yes". nostr followed by the output string corresponding to "no".The LC_MESSAGES definition ends with the string END LC_MESSAGES.
LC_MEASUREMENT¶
The definition starts with the string LC_MEASUREMENT in the first column.
The following keywords are allowed:
measurement followed by number identifying the standard used for measurement. The following values are recognized:The LC_MEASUREMENT definition ends with the string END LC_MEASUREMENT.
LC_MONETARY¶
The definition starts with the string LC_MONETARY in the first column.
The following keywords are allowed:
int_curr_symbol followed by the international currency symbol. This must be a 4-character string containing the international currency symbol as defined by the ISO 4217 standard (three characters) followed by a separator. currency_symbol followed by the local currency symbol. mon_decimal_point followed by the single-character string that will be used as the decimal delimiter when formatting monetary quantities. mon_thousands_sep followed by the single-character string that will be used as a group separator when formatting monetary quantities. mon_grouping followed by a sequence of integers separated by semicolons that describe the formatting of monetary quantities. See grouping below for details. positive_sign followed by a string that is used to indicate a positive sign for monetary quantities. negative_sign followed by a string that is used to indicate a negative sign for monetary quantities. int_frac_digits followed by the number of fractional digits that should be used when formatting with the int_curr_symbol. frac_digits followed by the number of fractional digits that should be used when formatting with the currency_symbol. p_cs_precedes followed by an integer that indicates the placement of currency_symbol for a nonnegative formatted monetary quantity:The LC_MONETARY definition ends with the string END LC_MONETARY.
LC_NAME¶
The definition starts with the string LC_NAME in the first column.
Various keywords are allowed, but only name_fmt is mandatory. Other keywords are needed only if there is common convention to use the corresponding salutation in this locale. The allowed keywords are as follows:
name_fmt followed by a string containing field descriptors that define the format used for names in the locale. The following field descriptors are recognized:The LC_NAME definition ends with the string END LC_NAME.
LC_NUMERIC¶
The definition starts with the string LC_NUMERIC in the first column.
The following keywords are allowed:
decimal_point followed by the single-character string that will be used as the decimal delimiter when formatting numeric quantities. thousands_sep followed by the single-character string that will be used as a group separator when formatting numeric quantities. grouping followed by a sequence of integers separated by semicolons that describe the formatting of numeric quantities. Each integer specifies the number of digits in a group. The first integer defines the size of the group immediately to the left of the decimal delimiter. Subsequent integers define succeeding groups to the left of the previous group. If the last integer is not -1, then the size of the previous group (if any) is repeatedly used for the remainder of the digits. If the last integer is -1, then no further grouping is performed.The LC_NUMERIC definition ends with the string END LC_NUMERIC.
LC_PAPER¶
The definition starts with the string LC_PAPER in the first column.
The following keywords are allowed:
height followed by the height, in millimeters, of the standard paper format. width followed by the width, in millimeters, of the standard paper format.The LC_PAPER definition ends with the string END LC_PAPER.
LC_TELEPHONE¶
The definition starts with the string LC_TELEPHONE in the first column.
The following keywords are allowed:
tel_int_fmt followed by a string that contains field descriptors that identify the format used to dial international numbers. The following field descriptors are recognized:The LC_TELEPHONE definition ends with the string END LC_TELEPHONE.
LC_TIME¶
The definition starts with the string LC_TIME in the first column.
The following keywords are allowed:
abday followed by a list of abbreviated names of the days of the week. The list starts with the first day of the week as specified by week (Sunday by default). See NOTES. day followed by a list of names of the days of the week. The list starts with the first day of the week as specified by week (Sunday by default). See NOTES. abmon followed by a list of abbreviated month names. mon followed by a list of month names. d_t_fmt followed by the appropriate date and time format (for syntax, see strftime(3)). d_fmt followed by the appropriate date format (for syntax, see strftime(3)). t_fmt followed by the appropriate time format (for syntax, see strftime(3)). am_pm followed by the appropriate representation of the am and pm strings. This should be left empty for locales not using AM/PM convention. t_fmt_ampm followed by the appropriate time format (for syntax, see strftime(3)) when using 12h clock format. This should be left empty for locales not using AM/PM convention. era followed by semicolon-separated strings that define how years are counted and displayed for each era in the locale. Each string has the following format:direction:offset:start_date:end_date:era_name:era_format
The fields are to be defined as follows:
direction Either + or -. + means the years closer to start_date have lower numbers than years closer to end_date. - means the opposite. offset The number of the year closest to start_date in the era, corresponding to the %Ey descriptor (see strptime(3)). start_date The start of the era in the form of yyyy/mm/dd. Years prior AD 1 are represented as negative numbers. end_date The end of the era in the form of yyyy/mm/dd, or one of the two special values of -* or +*. -* means the ending date is the beginning of time. +* means the ending date is the end of time. era_name The name of the era corresponding to the %EC descriptor (see strptime(3)). era_format The format of the year in the era corresponding to the %EY descriptor (see strptime(3)).The LC_TIME definition ends with the string END LC_TIME.
FILES¶
/usr/lib/locale/locale-archive Usual default locale archive location. /usr/share/i18n/locales Usual default path for locale definition files.STANDARDS¶
POSIX.2.
NOTES¶
The collective GNU C library community wisdom regarding abday, day, week, first_weekday, and first_workday states at https://sourceware.org/glibc/wiki/Locales the following:
• The value of the second week list item specifies the base of the abday and day lists. • first_weekday specifies the offset of the first day-of-week in the abday and day lists. • For compatibility reasons, all glibc locales should set the value of the second week list item to 19971130 (Sunday) and base the abday and day lists appropriately, and set first_weekday and first_workday to 1 or 2, depending on whether the week and work week actually starts on Sunday or Monday for the locale.SEE ALSO¶
iconv(1), locale(1), localedef(1), localeconv(3), newlocale(3), setlocale(3), strftime(3), strptime(3), uselocale(3), charmap(5), charsets(7), locale(7), unicode(7), utf-8(7)
| 2024-06-15 | Linux man-pages 6.9.1 |