Character (Java SE 26 & JDK 26)
Contents  
  1. Description
    1. Unicode Conformance
    2. Unicode Character Representations
  2. Nested Class Summary
  3. Field Summary
  4. Constructor Summary
  5. Method Summary
  6. Field Details
    1. MIN_RADIX
    2. MAX_RADIX
    3. MIN_VALUE
    4. MAX_VALUE
    5. TYPE
    6. UNASSIGNED
    7. UPPERCASE_LETTER
    8. LOWERCASE_LETTER
    9. TITLECASE_LETTER
    10. MODIFIER_LETTER
    11. OTHER_LETTER
    12. NON_SPACING_MARK
    13. ENCLOSING_MARK
    14. COMBINING_SPACING_MARK
    15. DECIMAL_DIGIT_NUMBER
    16. LETTER_NUMBER
    17. OTHER_NUMBER
    18. SPACE_SEPARATOR
    19. LINE_SEPARATOR
    20. PARAGRAPH_SEPARATOR
    21. CONTROL
    22. FORMAT
    23. PRIVATE_USE
    24. SURROGATE
    25. DASH_PUNCTUATION
    26. START_PUNCTUATION
    27. END_PUNCTUATION
    28. CONNECTOR_PUNCTUATION
    29. OTHER_PUNCTUATION
    30. MATH_SYMBOL
    31. CURRENCY_SYMBOL
    32. MODIFIER_SYMBOL
    33. OTHER_SYMBOL
    34. INITIAL_QUOTE_PUNCTUATION
    35. FINAL_QUOTE_PUNCTUATION
    36. DIRECTIONALITY_UNDEFINED
    37. DIRECTIONALITY_LEFT_TO_RIGHT
    38. DIRECTIONALITY_RIGHT_TO_LEFT
    39. DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
    40. DIRECTIONALITY_EUROPEAN_NUMBER
    41. DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
    42. DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
    43. DIRECTIONALITY_ARABIC_NUMBER
    44. DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
    45. DIRECTIONALITY_NONSPACING_MARK
    46. DIRECTIONALITY_BOUNDARY_NEUTRAL
    47. DIRECTIONALITY_PARAGRAPH_SEPARATOR
    48. DIRECTIONALITY_SEGMENT_SEPARATOR
    49. DIRECTIONALITY_WHITESPACE
    50. DIRECTIONALITY_OTHER_NEUTRALS
    51. DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
    52. DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
    53. DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
    54. DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
    55. DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
    56. DIRECTIONALITY_LEFT_TO_RIGHT_ISOLATE
    57. DIRECTIONALITY_RIGHT_TO_LEFT_ISOLATE
    58. DIRECTIONALITY_FIRST_STRONG_ISOLATE
    59. DIRECTIONALITY_POP_DIRECTIONAL_ISOLATE
    60. MIN_HIGH_SURROGATE
    61. MAX_HIGH_SURROGATE
    62. MIN_LOW_SURROGATE
    63. MAX_LOW_SURROGATE
    64. MIN_SURROGATE
    65. MAX_SURROGATE
    66. MIN_SUPPLEMENTARY_CODE_POINT
    67. MIN_CODE_POINT
    68. MAX_CODE_POINT
    69. SIZE
    70. BYTES
  7. Constructor Details
    1. Character(char)
  8. Method Details
    1. describeConstable()
    2. valueOf(char)
    3. charValue()
    4. hashCode()
    5. hashCode(char)
    6. equals(Object)
    7. toString()
    8. toString(char)
    9. toString(int)
    10. isValidCodePoint(int)
    11. isBmpCodePoint(int)
    12. isSupplementaryCodePoint(int)
    13. isHighSurrogate(char)
    14. isLowSurrogate(char)
    15. isSurrogate(char)
    16. isSurrogatePair(char, char)
    17. charCount(int)
    18. toCodePoint(char, char)
    19. codePointAt(CharSequence, int)
    20. codePointAt(char[], int)
    21. codePointAt(char[], int, int)
    22. codePointBefore(CharSequence, int)
    23. codePointBefore(char[], int)
    24. codePointBefore(char[], int, int)
    25. highSurrogate(int)
    26. lowSurrogate(int)
    27. toChars(int, char[], int)
    28. toChars(int)
    29. codePointCount(CharSequence, int, int)
    30. codePointCount(char[], int, int)
    31. offsetByCodePoints(CharSequence, int, int)
    32. offsetByCodePoints(char[], int, int, int, int)
    33. isLowerCase(char)
    34. isLowerCase(int)
    35. isUpperCase(char)
    36. isUpperCase(int)
    37. isTitleCase(char)
    38. isTitleCase(int)
    39. isDigit(char)
    40. isDigit(int)
    41. isDefined(char)
    42. isDefined(int)
    43. isLetter(char)
    44. isLetter(int)
    45. isLetterOrDigit(char)
    46. isLetterOrDigit(int)
    47. isJavaLetter(char)
    48. isJavaLetterOrDigit(char)
    49. isAlphabetic(int)
    50. isIdeographic(int)
    51. isJavaIdentifierStart(char)
    52. isJavaIdentifierStart(int)
    53. isJavaIdentifierPart(char)
    54. isJavaIdentifierPart(int)
    55. isUnicodeIdentifierStart(char)
    56. isUnicodeIdentifierStart(int)
    57. isUnicodeIdentifierPart(char)
    58. isUnicodeIdentifierPart(int)
    59. isIdentifierIgnorable(char)
    60. isIdentifierIgnorable(int)
    61. isEmoji(int)
    62. isEmojiPresentation(int)
    63. isEmojiModifier(int)
    64. isEmojiModifierBase(int)
    65. isEmojiComponent(int)
    66. isExtendedPictographic(int)
    67. toLowerCase(char)
    68. toLowerCase(int)
    69. toUpperCase(char)
    70. toUpperCase(int)
    71. toTitleCase(char)
    72. toTitleCase(int)
    73. digit(char, int)
    74. digit(int, int)
    75. getNumericValue(char)
    76. getNumericValue(int)
    77. isSpace(char)
    78. isSpaceChar(char)
    79. isSpaceChar(int)
    80. isWhitespace(char)
    81. isWhitespace(int)
    82. isISOControl(char)
    83. isISOControl(int)
    84. getType(char)
    85. getType(int)
    86. forDigit(int, int)
    87. getDirectionality(char)
    88. getDirectionality(int)
    89. isMirrored(char)
    90. isMirrored(int)
    91. compareTo(Character)
    92. compare(char, char)
    93. reverseBytes(char)
    94. getName(int)
    95. codePointOf(String)

Class Character

java.lang.Object
java.lang.Character
All Implemented Interfaces: Serializable, Comparable<Character>, Constable
public final class Character extends Object implements Serializable, Comparable<Character>, Constable
The Character class is the wrapper class for values of the primitive type char. An object of type Character contains a single field whose type is char.

In addition, this class provides a large number of static methods for determining a character's category (lowercase letter, digit, etc.) and for converting characters from uppercase to lowercase and vice versa.
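As a rough illustration of these static methods, the sketch below classifies a few char values and converts their case. The class name CharCategoryDemo and the helper describe are hypothetical names chosen for this example; only the Character methods come from the API.

```java
class CharCategoryDemo {
    // Hypothetical helper: maps a char to a short, human-readable category label.
    static String describe(char c) {
        if (Character.isDigit(c)) return "digit";
        if (Character.isLowerCase(c)) return "lowercase letter";
        if (Character.isUpperCase(c)) return "uppercase letter";
        if (Character.isWhitespace(c)) return "whitespace";
        return "other";
    }

    public static void main(String[] args) {
        for (char c : new char[] {'a', 'Q', '7', ' ', '#'}) {
            // toUpperCase/toLowerCase return the input unchanged when no mapping exists.
            System.out.println("'" + c + "' -> " + describe(c)
                    + ", upper=" + Character.toUpperCase(c)
                    + ", lower=" + Character.toLowerCase(c));
        }
    }
}
```

The order of the checks in describe is an arbitrary choice for this sketch; the categories tested here do not overlap, so reordering them would not change the result.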

Unicode Conformance

The fields and methods of class Character are defined in terms of character information from the Unicode Standard, specifically the UnicodeData file that is part of the Unicode Character Database. This file specifies properties including name and category for every assigned Unicode code point or character range. The file is available from the Unicode Consortium at http://www.unicode.org.
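The name and category data described above can be queried at run time. The following is a minimal sketch, assuming a helper class named UnicodeDataLookupDemo (hypothetical); it uses getName(int), getType(int), and codePointOf(String), all of which are listed in this class's method summary.

```java
class UnicodeDataLookupDemo {
    // Returns the Unicode name of an assigned code point, or null if it is unassigned.
    static String nameOf(int codePoint) {
        return Character.getName(codePoint);
    }

    // True if the code point's general category is "Lu" (UPPERCASE_LETTER).
    static boolean isUppercaseLetter(int codePoint) {
        return Character.getType(codePoint) == Character.UPPERCASE_LETTER;
    }

    public static void main(String[] args) {
        int cp = 'A';
        System.out.println(nameOf(cp));            // name taken from the Unicode data
        System.out.println(isUppercaseLetter(cp)); // category check
        // codePointOf performs the reverse lookup, from name to code point.
        System.out.println(Character.codePointOf("LATIN CAPITAL LETTER A") == cp);
    }
}
```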

Character information is based on the Unicode Standard, version 17.0.

The Java platform has supported different versions of the Unicode Standard over time. Upgrades to newer versions of the Unicode Standard occurred in the following Java releases, each indicating the new version:

Java releases and supported Unicode versions

Java release    Unicode version
Java SE 26      Unicode 17.0
Java SE 24      Unicode 16.0
Java SE 22      Unicode 15.1
Java SE 20      Unicode 15.0
Java SE 19      Unicode 14.0
Java SE 15      Unicode 13.0
Java SE 13      Unicode 12.1
Java SE 12      Unicode 11.0
Java SE 11      Unicode 10.0
Java SE 9       Unicode 8.0
Java SE 8       Unicode 6.2
Java SE 7       Unicode 6.0
Java SE 5.0     Unicode 4.0
Java SE 1.4     Unicode 3.0
JDK 1.1         Unicode 2.0
JDK 1.0.2       Unicode 1.1.5

Variations from these base Unicode versions, such as recognized appendixes, are documented elsewhere.

Unicode Character Representations

The char data type (and therefore the value that a Character object encapsulates) is based on the original Unicode specification, which defined characters as fixed-width 16-bit entities. The Unicode Standard has since been changed to allow for characters whose representation requires more than 16 bits. The range of legal code points is now U+0000 to U+10FFFF, known as Unicode scalar value.

The set of characters from U+0000 to U+FFFF is sometimes referred to as the Basic Multilingual Plane (BMP). Characters whose code points are greater than U+FFFF are called supplementary characters. The Java platform uses the UTF-16 representation in char arrays and in the String and StringBuffer classes. In this representation, supplementary characters are represented as a pair of char values, the first from the high-surrogates range (\uD800-\uDBFF), the second from the low-surrogates range (\uDC00-\uDFFF).
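The surrogate-pair encoding can be sketched with the conversion methods of this class. The class name SurrogatePairDemo and its encode and decode helpers are hypothetical; U+1F600 is used only as a sample supplementary code point.

```java
class SurrogatePairDemo {
    // Encodes a code point as one char (BMP) or two chars (supplementary).
    static char[] encode(int codePoint) {
        return Character.toChars(codePoint);
    }

    // Decodes a high/low surrogate pair back into a supplementary code point.
    static int decode(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        return Character.toCodePoint(high, low);
    }

    public static void main(String[] args) {
        int cp = 0x1F600; // sample supplementary code point
        char[] units = encode(cp);
        System.out.printf("U+%X -> %d code unit(s)%n", cp, units.length);
        for (char u : units) {
            System.out.printf("  0x%04X high=%b low=%b%n",
                    (int) u, Character.isHighSurrogate(u), Character.isLowSurrogate(u));
        }
        System.out.printf("decoded: U+%X%n", decode(units[0], units[1]));
    }
}
```

The same split is available without allocating an array through highSurrogate(int) and lowSurrogate(int).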

A char value, therefore, represents Basic Multilingual Plane (BMP) code points, including the surrogate code points, or code units of the UTF-16 encoding. An int value represents all Unicode code points, including supplementary code points. The lower (least significant) 21 bits of int are used to represent Unicode code points and the upper (most significant) 11 bits must be zero. Unless otherwise specified, the behavior with respect to supplementary characters and surrogate char values is as follows:
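A small sketch of how an int value can be checked before it is treated as a code point follows. The class CodePointRangeDemo and its classify helper are hypothetical. Note that fitting in 21 bits is necessary but not sufficient: values from 0x110000 to 0x1FFFFF fit in 21 bits yet lie above U+10FFFF.

```java
class CodePointRangeDemo {
    // Labels an int as "invalid", "BMP", or "supplementary".
    static String classify(int value) {
        if (!Character.isValidCodePoint(value)) return "invalid";
        if (Character.isBmpCodePoint(value)) return "BMP";
        return "supplementary"; // equivalent to isSupplementaryCodePoint(value)
    }

    public static void main(String[] args) {
        int[] samples = {0x41, 0xD800, 0xFFFF, 0x10000, 0x10FFFF, 0x110000, -1};
        for (int v : samples) {
            // charCount reports how many UTF-16 code units the value would need;
            // it is only meaningful for valid code points.
            String units = Character.isValidCodePoint(v)
                    ? String.valueOf(Character.charCount(v)) : "n/a";
            System.out.printf("0x%X -> %s, code units: %s%n", v, classify(v), units);
        }
    }
}
```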

  • The methods that only accept a char value cannot support supplementary characters. They treat char values from the surrogate ranges as undefined characters. For example, Character.isLetter('\uD840') returns false, even though this specific value, if followed by any low-surrogate value in a string, would represent a letter.
  • The methods that accept an int value support all Unicode characters, including supplementary characters. For example, Character.isLetter(0x2F81A) returns true because the code point value represents a letter (a CJK ideograph).
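The difference between the two overload families can be seen side by side. The sketch below reuses the two values from the bullets above; the class CharVersusIntDemo and the helper isLetterAt are hypothetical names for this example.

```java
class CharVersusIntDemo {
    // Tests the whole code point starting at index, rather than a single char.
    static boolean isLetterAt(CharSequence text, int index) {
        return Character.isLetter(Character.codePointAt(text, index));
    }

    public static void main(String[] args) {
        // Build a one-character string holding the supplementary code point U+2F81A.
        String s = new String(Character.toChars(0x2F81A));

        // char overload: sees only the leading high surrogate.
        System.out.println(Character.isLetter(s.charAt(0)));
        // int overload: sees the full code point.
        System.out.println(isLetterAt(s, 0));
        // A lone surrogate literal, as in the first bullet above.
        System.out.println(Character.isLetter('\uD840'));
    }
}
```

When scanning text that may contain supplementary characters, reading with codePointAt and calling the int overloads is one way to avoid misclassifying surrogate halves.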

In the Java SE API documentation, Unicode code point is used for character values in the range between U+0000 and U+10FFFF, and Unicode code unit is used for 16-bit char values that are code units of the UTF-16 encoding. For more information on Unicode terminology, refer to the Unicode Glossary.
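To make the distinction between code units and code points concrete, the following sketch counts both for the same string and then walks it one code point at a time. The class CodeUnitCountDemo and its helpers are hypothetical; the sample string is an assumption chosen so that the two counts differ.

```java
class CodeUnitCountDemo {
    // Number of UTF-16 code units (16-bit char values).
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return Character.codePointCount(s, 0, s.length());
    }

    public static void main(String[] args) {
        // "a", then U+1F600 (one supplementary character), then "b".
        String s = "a" + new String(Character.toChars(0x1F600)) + "b";
        System.out.println("code units:  " + codeUnits(s));
        System.out.println("code points: " + codePoints(s));

        // Advance by charCount so that a surrogate pair is consumed as one step.
        for (int i = 0; i < s.length(); ) {
            int cp = Character.codePointAt(s, i);
            System.out.printf("index %d: U+%04X%n", i, cp);
            i += Character.charCount(cp);
        }
    }
}
```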

This is a value-based class; programmers should treat instances that are equal as interchangeable and should not use instances for synchronization, or unpredictable behavior may occur. For example, in a future release, synchronization may fail.
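In practice this suggests comparing Character instances with equals rather than ==, and locking on a dedicated object instead of a Character. The sketch below shows one way to follow that advice; the class ValueBasedCharacterDemo, its LOCK field, and both helpers are hypothetical.

```java
import java.util.Objects;

class ValueBasedCharacterDemo {
    // Dedicated lock object; a Character instance should not be used for this.
    private static final Object LOCK = new Object();
    private static int counter = 0;

    // Compares by value (null-safe) instead of by object identity.
    static boolean sameChar(Character a, Character b) {
        return Objects.equals(a, b);
    }

    // Synchronizes on LOCK rather than on any boxed value.
    static int incrementSafely() {
        synchronized (LOCK) {
            return ++counter;
        }
    }

    public static void main(String[] args) {
        Character first = Character.valueOf('x');
        Character second = Character.valueOf('x');
        // Equal instances are interchangeable; whether they are also the same
        // object is not something a program should rely on.
        System.out.println(sameChar(first, second));
        System.out.println(first.compareTo(second) == 0);
        System.out.println(incrementSafely());
    }
}
```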

Since: 1.0
