DataInput (Java SE 26 & JDK 26)
Contents  
  1. Description
    1. Modified UTF-8
  2. Method Summary
  3. Method Details
    1. readFully(byte[])
    2. readFully(byte[], int, int)
    3. skipBytes(int)
    4. readBoolean()
    5. readByte()
    6. readUnsignedByte()
    7. readShort()
    8. readUnsignedShort()
    9. readChar()
    10. readInt()
    11. readLong()
    12. readFloat()
    13. readDouble()
    14. readLine()
    15. readUTF()

Interface DataInput

All Known Subinterfaces: ImageInputStream, ImageOutputStream, ObjectInput

All Known Implementing Classes: DataInputStream, FileCacheImageInputStream, FileCacheImageOutputStream, FileImageInputStream, FileImageOutputStream, ImageInputStreamImpl, ImageOutputStreamImpl, MemoryCacheImageInputStream, MemoryCacheImageOutputStream, ObjectInputStream, RandomAccessFile

public interface DataInput
The DataInput interface provides for reading bytes from a binary stream and reconstructing from them data in any of the Java primitive types. There is also a facility for reconstructing a String from data in modified UTF-8 format.
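As an illustration of reading primitive values and a string back from a binary stream, the following minimal sketch writes a few fields with DataOutputStream and reads them through the DataInput interface in the same order. The class name DataInputRoundTrip, its helper methods, and the field layout are hypothetical and chosen only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class; not part of the JDK.
class DataInputRoundTrip {

    // Writes an int, a double, a boolean and a string, in that order.
    static byte[] encode(int id, double price, boolean active, String name) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(id);
            out.writeDouble(price);
            out.writeBoolean(active);
            out.writeUTF(name); // string stored in modified UTF-8
        }
        return buffer.toByteArray();
    }

    // Reads the fields back through DataInput; the order must match encode().
    static String decode(byte[] bytes) throws IOException {
        DataInput in = new DataInputStream(new ByteArrayInputStream(bytes));
        int id = in.readInt();
        double price = in.readDouble();
        boolean active = in.readBoolean();
        String name = in.readUTF();
        return id + " " + price + " " + active + " " + name;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decode(encode(42, 9.5, true, "widget")));
    }
}
```

The stream carries no type information, so the reader has to call the read methods in the same order and with the same types that the writer used.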

It is generally true of all the reading routines in this interface that if end of file is reached before the desired number of bytes has been read, an EOFException (which is a kind of IOException) is thrown. If any byte cannot be read for any reason other than end of file, an IOException other than EOFException is thrown. In particular, an IOException may be thrown if the input stream has been closed.
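The end-of-file behavior can be observed with a short sketch that reads int values until the input runs out. It assumes a DataInputStream over an in-memory byte array; the class DataInputEof and the method countInts are hypothetical names used only here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical demo class; not part of the JDK.
class DataInputEof {

    // Returns how many complete 4-byte ints could be read before end of input.
    static int countInts(byte[] bytes) throws IOException {
        DataInput in = new DataInputStream(new ByteArrayInputStream(bytes));
        int count = 0;
        try {
            while (true) {
                in.readInt(); // throws EOFException if fewer than 4 bytes remain
                count++;
            }
        } catch (EOFException endOfInput) {
            // End of input reached, possibly in the middle of a value.
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Six bytes: one complete int followed by two leftover bytes.
        System.out.println(countInts(new byte[6]));
    }
}
```

Because EOFException is a subclass of IOException, it has to be caught before any more general IOException handler if the two cases are to be treated differently.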

Modified UTF-8

Implementations of the DataInput and DataOutput interfaces represent Unicode strings in a format that is a slight modification of UTF-8. (For information regarding the standard UTF-8 format, see section 3.9 Unicode Encoding Forms of The Unicode Standard, Version 4.0.)

  • Characters in the range '\u0001' to '\u007F' are represented by a single byte.
  • The null character '\u0000' and characters in the range '\u0080' to '\u07FF' are represented by a pair of bytes.
  • Characters in the range '\u0800' to '\uFFFF' are represented by three bytes.
Encoding of UTF-8 values

Value                      Byte   Bit values (bit 7 down to bit 0)
\u0001 to \u007F           1      0 followed by bits 6-0
\u0000, \u0080 to \u07FF   1      1 1 0 followed by bits 10-6
                           2      1 0 followed by bits 5-0
\u0800 to \uFFFF           1      1 1 1 0 followed by bits 15-12
                           2      1 0 followed by bits 11-6
                           3      1 0 followed by bits 5-0
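The bit layout in the table can be sketched as a per-character encoder. This is an illustrative reading of the table, not the JDK's implementation; the class ModifiedUtf8Table and its methods encodeChar and hex are hypothetical names.

```java
// Hypothetical demo class; not part of the JDK.
class ModifiedUtf8Table {

    // Encodes one char according to the three ranges in the table.
    static byte[] encodeChar(char c) {
        if (c >= 0x0001 && c <= 0x007F) {
            // One byte: 0 followed by bits 6-0.
            return new byte[] { (byte) c };
        } else if (c <= 0x07FF) {
            // Two bytes (also used for U+0000): 110 + bits 10-6, then 10 + bits 5-0.
            return new byte[] {
                (byte) (0xC0 | ((c >> 6) & 0x1F)),
                (byte) (0x80 | (c & 0x3F))
            };
        } else {
            // Three bytes: 1110 + bits 15-12, 10 + bits 11-6, 10 + bits 5-0.
            return new byte[] {
                (byte) (0xE0 | ((c >> 12) & 0x0F)),
                (byte) (0x80 | ((c >> 6) & 0x3F)),
                (byte) (0x80 | (c & 0x3F))
            };
        }
    }

    // Formats bytes as upper-case hex for display.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(encodeChar('A')));        // 41
        System.out.println(hex(encodeChar((char) 0)));   // C080
        System.out.println(hex(encodeChar('\u00E9')));   // C3A9
        System.out.println(hex(encodeChar('\u20AC')));   // E282AC
    }
}
```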

The differences between this format and the standard UTF-8 format are the following:

  • The null byte '\u0000' is encoded in 2-byte format rather than 1-byte, so that the encoded strings never have embedded nulls.
  • Only the 1-byte, 2-byte, and 3-byte formats are used.
  • Supplementary characters are represented in the form of surrogate pairs.
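These differences can be made visible by comparing the bytes written by DataOutputStream.writeUTF with standard UTF-8 from String.getBytes. The sketch below is illustrative only. The class ModifiedVsStandardUtf8 and its helpers are hypothetical, and the code skips the two-byte length prefix that writeUTF places before the encoded characters.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of the JDK.
class ModifiedVsStandardUtf8 {

    // Modified UTF-8 bytes as written by writeUTF, without its 2-byte length prefix.
    static byte[] modifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        }
        byte[] all = buffer.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Standard UTF-8 bytes for comparison.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String nul = "\0";                      // the null character U+0000
        String supplementary = "\uD83D\uDE00";  // U+1F600 as a surrogate pair

        // Null character: two bytes in modified UTF-8, one in standard UTF-8.
        System.out.println(modifiedUtf8(nul).length + " vs " + standardUtf8(nul).length);

        // Supplementary character: two 3-byte surrogates vs one 4-byte sequence.
        System.out.println(modifiedUtf8(supplementary).length + " vs "
                + standardUtf8(supplementary).length);
    }
}
```

Because of these differences, bytes produced for readUTF are in general not interchangeable with standard UTF-8 when the text contains null or supplementary characters.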

Since: 1.0 See Also:
