Tool for encoding and decoding Unicode escaped characters, generating \uXXXX or \u{X} sequences, analyzing and converting Unicode text online.
Unicode Escape - dCode
Unicode escaping is a method for representing a Unicode character using its numeric value (code point) rather than the character itself.
This notation typically begins with a backslash \ followed by a prefix and hexadecimal digits.
This abstraction allows for text manipulation in environments where the direct display of a special character is not guaranteed or desired.
To escape a character using Unicode:
— Identify the character's Unicode code point
— Convert this value to hexadecimal
— Apply the appropriate escape format (see below)
Example: The acute-accented character é has code point 233, which is E9 in hexadecimal (0xE9), and is written with the escape sequence \u00E9 or \u{E9}
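The three steps above can be sketched in Python. This is a minimal illustration, not dCode's implementation; the function name escape_char and its braces parameter are hypothetical names chosen for this example.

```python
def escape_char(ch, braces=False):
    """Return a Unicode escape for a single character (illustrative helper)."""
    code_point = ord(ch)                   # step 1: code point, e.g. 233 for 'é'
    hex_digits = format(code_point, "X")   # step 2: hexadecimal, e.g. 'E9'
    # step 3: apply the chosen escape format
    if braces:
        return "\\u{" + hex_digits + "}"
    if code_point > 0xFFFF:
        # A 4-digit escape cannot hold this value on its own.
        raise ValueError("outside the BMP: use the brace form or a surrogate pair")
    return "\\u" + hex_digits.zfill(4)     # pad to exactly 4 digits


print(escape_char("é"))               # \u00E9
print(escape_char("é", braces=True))  # \u{E9}
```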
Unicode escape formats are different textual notations for the same code point. Which syntax applies depends on the language, regular expression engine, or serialization system.
— Format \uXXXX: the oldest standard format, a fixed notation of exactly 4 hexadecimal digits. This format is common in Java, JSON, and some parsers but is limited to the Basic Multilingual Plane (BMP), i.e., characters between U+0000 and U+FFFF. A character outside the BMP is written as two consecutive sequences forming a UTF-16 surrogate pair, e.g. U+1F600 becomes \uD83D\uDE00.
— Format \u{X}: a more recent format, a variable-length notation enclosed in curly braces. It represents any code point (up to U+10FFFF) without padding to a fixed length. This syntax is used in modern JavaScript (ES2015 and later), Rust, PHP 7+, Swift, and Ruby, but not in Python.
— Format \UXXXXXXXX: a format used in Python to directly represent a complete code point using exactly 8 hexadecimal digits, without resorting to surrogate pairs.
— Format \x{X}: a format that replaces u with x, found in some regular expression engines (such as PCRE).
— Format \X (where X stands for the hexadecimal digits): a format used in CSS, offering the simplest notation: a backslash followed directly by 1 to 6 hexadecimal digits. This approach is sometimes ambiguous because, depending on the language, a backslash followed directly by digits has historically denoted an octal or hexadecimal escape.
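To compare the formats listed above side by side, the following Python sketch renders one code point in each notation. The helper names surrogate_pair and escape_formats are hypothetical, and the dictionary keys are only labels for this example.

```python
def surrogate_pair(cp):
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def escape_formats(cp):
    """Render one code point in each escape notation (illustrative only)."""
    if cp <= 0xFFFF:
        u4 = f"\\u{cp:04X}"
    else:
        high, low = surrogate_pair(cp)
        u4 = f"\\u{high:04X}\\u{low:04X}"   # two consecutive 4-digit escapes
    hex_digits = f"{cp:X}"
    return {
        "\\uXXXX": u4,
        "\\u{X}": "\\u{" + hex_digits + "}",
        "\\UXXXXXXXX": f"\\U{cp:08X}",
        "\\x{X}": "\\x{" + hex_digits + "}",
        "CSS": "\\" + hex_digits,
    }


for label, text in escape_formats(0x1F600).items():
    print(label, text)
```

For U+1F600 the 4-digit form comes out as \uD83D\uDE00, while the brace and 8-digit forms carry the code point directly.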
Decoding a Unicode escape sequence involves:
— Recognizing the pattern: \uXXXX, \u{X}, or other
— Extracting the hexadecimal portion
— Converting the hexadecimal to decimal to obtain the code point
— Interpreting this code point as a Unicode character
Example: for \u0041, extract 0041 and convert it to decimal 65, which is the Unicode character A
Most programming languages provide native functions for this process.
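The decoding steps can also be written out by hand, as in this Python sketch. It is a simplified illustration that only handles the \uXXXX, \u{X} and \UXXXXXXXX forms; the name decode_escapes is hypothetical, and values above U+10FFFF would raise a ValueError from chr.

```python
import re

# One alternative per notation; each captures only the hexadecimal part.
_ESCAPE = re.compile(
    r"\\u\{([0-9A-Fa-f]{1,6})\}"   # \u{X}
    r"|\\U([0-9A-Fa-f]{8})"        # \UXXXXXXXX
    r"|\\u([0-9A-Fa-f]{4})"        # \uXXXX
)


def decode_escapes(text):
    """Replace Unicode escape sequences in text with the characters they name."""
    def replace(match):
        hex_digits = match.group(1) or match.group(2) or match.group(3)
        return chr(int(hex_digits, 16))   # hexadecimal -> code point -> character

    decoded = _ESCAPE.sub(replace, text)
    # Merge UTF-16 surrogate pairs (e.g. \uD83D\uDE00) into a single code point;
    # "surrogatepass" leaves any unpaired surrogate in place instead of failing.
    return decoded.encode("utf-16", "surrogatepass").decode("utf-16", "surrogatepass")


print(decode_escapes("\\u0041"))      # A
print(decode_escapes("caf\\u00E9"))   # café
```

For input that follows JSON string syntax, Python's standard json.loads already decodes \uXXXX sequences, so a hand-written decoder like this is mainly useful for mixed or non-JSON text.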
Identify a Unicode escape sequence by these characteristic patterns:
— \uXXXX: backslash + u + 4 hexadecimal digits
— \u{X} or \u{XXXX}: flexible notation with curly braces
— \UXXXXXXXX: backslash + U + 8 hexadecimal digits
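The patterns above map directly onto regular expressions. The following Python sketch reports which notations appear in a string; PATTERNS and detect_formats are hypothetical names, and the limit of 1 to 6 digits inside the braces is an assumption based on the maximum code point U+10FFFF.

```python
import re

PATTERNS = {
    "\\uXXXX": re.compile(r"\\u[0-9A-Fa-f]{4}"),        # backslash + u + 4 hex digits
    "\\u{X}": re.compile(r"\\u\{[0-9A-Fa-f]{1,6}\}"),   # curly-brace notation
    "\\UXXXXXXXX": re.compile(r"\\U[0-9A-Fa-f]{8}"),    # backslash + U + 8 hex digits
}


def detect_formats(text):
    """Return the labels of every escape notation found in text."""
    return [label for label, pattern in PATTERNS.items() if pattern.search(text)]


print(detect_formats("caf\\u00E9"))   # ['\\uXXXX']
```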
Common variants include:
\uXXXX: standard 4-digit notation
\u{X}: modern compact notation
\UXXXXXXXX: 8-digit notation used in some languages like Python
\x{X}: alternative notation used by some regular expression engines
HTML: &#xXXXX; (entirely different notation)
URL encoding: %XX
Surrogate pairs generated by UTF-16 for code points greater than U+FFFF
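The HTML and URL variants differ from the backslash notations in more than syntax, which a short Python sketch using only the standard library can show. The helper names html_entity and url_encode are hypothetical. Note that URL encoding operates on the character's UTF-8 bytes rather than on its code point.

```python
import html
from urllib.parse import quote, unquote


def html_entity(ch):
    """Hexadecimal HTML numeric character reference for one character."""
    return f"&#x{ord(ch):X};"


def url_encode(ch):
    """Percent-encode a character (quote uses UTF-8 by default)."""
    return quote(ch)


print(html_entity("é"))          # &#xE9;
print(url_encode("é"))           # %C3%A9  (two UTF-8 bytes, not the code point E9)
print(html.unescape("&#xE9;"))   # é
print(unquote("%C3%A9"))         # é
```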
The content of the page "Unicode Escape" and its results may be freely copied and reused, including for commercial purposes, provided that dCode.fr is cited as the source (Creative Commons CC-BY free distribution license).
In a scientific article or book, the recommended bibliographic citation is: Unicode Escape on dCode.fr [online website], retrieved on 2025-11-18,