🔬 Unicode Converter
Convert text to various Unicode representations and decode them back. Handles emoji and surrogate pairs correctly.
Last updated: May 18, 2026 · By Λ
Decode Unicode Back to Text
Paste any encoded format (U+, \u, &#, etc.) and it will be auto-detected and decoded.
Free Unicode Converter
Convert any text to Unicode code points, JavaScript escape sequences, HTML decimal and hex entities, CSS escapes, Python escape sequences, UTF-8 hex bytes, and UTF-16 hex values. Decode any of these formats back to plain text with automatic format detection. The character inspector shows the name, code point, byte encoding, and Unicode category for every character in your input. Handles emoji, CJK characters, and supplementary plane characters (surrogate pairs) correctly. Every conversion runs inside this browser tab with no backend involved, so the text you analyze is never sent anywhere.
What is Unicode?
Unicode is a universal character encoding standard that assigns a unique numeric code point to every character across all writing systems, symbols, and emoji. It covers over 150,000 characters from more than 150 scripts, including Latin, Arabic, Cyrillic, and the Han, kana, and Hangul scripts used for Chinese, Japanese, and Korean. Unicode replaced the older patchwork of encoding systems (ASCII, ISO 8859, Shift JIS, and others) that often caused garbled text when documents moved between different systems or languages.
Developers frequently need to convert text into various Unicode representations for different programming contexts. JavaScript uses \u escapes, HTML uses numeric entities, CSS uses backslash hex escapes, and Python uses \u and \U escapes. This converter handles all of these formats instantly, plus raw UTF-8 byte sequences and UTF-16 hex values. It correctly processes emoji and characters from supplementary planes that require surrogate pairs in UTF-16. Whether you are debugging encoding issues, preparing strings for source code, or inspecting individual characters, this tool gives you every representation you need in one place.
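To make these formats concrete, here is a minimal Python sketch that builds each representation from a string. It is an illustration only, not this tool's source code. The function names are hypothetical, and the choices of the braced \u{...} form for JavaScript and six-digit padding for CSS are assumptions about one reasonable output style.

```python
def code_points(text: str) -> str:
    # U+XXXX notation, space-separated, at least four hex digits each.
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def js_escape(text: str) -> str:
    # Four-digit form for BMP characters, braced ES2015 form above U+FFFF.
    parts = []
    for ch in text:
        cp = ord(ch)
        parts.append(f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\u{{{cp:X}}}")
    return "".join(parts)


def python_escape(text: str) -> str:
    # Python needs the eight-digit \U form for code points above U+FFFF.
    parts = []
    for ch in text:
        cp = ord(ch)
        parts.append(f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}")
    return "".join(parts)


def html_decimal(text: str) -> str:
    return "".join(f"&#{ord(ch)};" for ch in text)


def html_hex(text: str) -> str:
    return "".join(f"&#x{ord(ch):X};" for ch in text)


def css_escape(text: str) -> str:
    # Padding to six hex digits means no terminating space is required.
    return "".join(f"\\{ord(ch):06X}" for ch in text)


def utf8_hex(text: str) -> str:
    return text.encode("utf-8").hex(" ").upper()


sample = "H\U0001F600"          # "H" followed by the grinning face emoji
print(code_points(sample))      # U+0048 U+1F600
print(html_decimal(sample))     # &#72;&#128512;
print(utf8_hex(sample))         # 48 F0 9F 98 80
```

Python strings iterate by code point, so the emoji is handled as one character here. A JavaScript version would need to iterate with for...of or codePointAt to avoid splitting the surrogate pair.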
How to Use This Tool
- Type or paste any text into the input field at the top. You can also click one of the sample buttons (Hello World, Emoji Test, CJK Sample) to load example text.
- View the converted output instantly in all formats: Unicode code points, JavaScript escapes, HTML entities (decimal and hex), CSS escapes, Python escapes, UTF-8 bytes, and UTF-16 hex.
- To decode an escaped string back to plain text, scroll to the Decode section, paste your escaped string, and click "Decode." The tool auto-detects the format.
- Use the Character Inspector at the bottom to see detailed information for each character, including its official Unicode name, code point, byte encoding, and category.
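The decode step described above can be approximated with a single regular expression. The sketch below is one hypothetical approach, not the tool's actual detection logic. It only covers the backslash and HTML entity formats, and the names ESCAPE and decode_escapes are made up for this example.

```python
import re

# Each alternative captures the digits of one escape format in a named group.
ESCAPE = re.compile(
    r"\\u\{(?P<js_braced>[0-9A-Fa-f]{1,6})\}"   # \u{1F600}
    r"|\\U(?P<py_long>[0-9A-Fa-f]{8})"          # \U0001F600
    r"|\\u(?P<u4>[0-9A-Fa-f]{4})"               # \u0048
    r"|&#[xX](?P<html_hex>[0-9A-Fa-f]+);"       # &#x1F600;
    r"|&#(?P<html_dec>[0-9]+);"                 # &#128512;
)


def decode_escapes(text: str) -> str:
    def replace(match):
        kind = match.lastgroup  # name of the group that matched
        value = int(match.group(kind), 10 if kind == "html_dec" else 16)
        # Leave out-of-range values untouched rather than raising.
        return chr(value) if value <= 0x10FFFF else match.group(0)

    decoded = ESCAPE.sub(replace, text)
    # An escaped surrogate pair decodes to two lone surrogates. A UTF-16
    # round trip merges each valid pair into a single code point.
    raw = decoded.encode("utf-16-le", "surrogatepass")
    return raw.decode("utf-16-le", "surrogatepass")


print(decode_escapes(r"\u0048\u0069"))          # Hi
print(decode_escapes("&#72;&#x69;"))            # Hi
```

Because every format is matched in one pass, mixed input such as a string containing both \u escapes and HTML entities is decoded without asking the user which format it is.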
Key Features
- Seven Output Formats - Convert text to Unicode code points, JavaScript, HTML (decimal and hex), CSS, Python escapes, UTF-8 bytes, and UTF-16 hex simultaneously.
- Smart Decoding - Paste any escaped string and the tool automatically detects whether it uses JavaScript, HTML, CSS, Python, or code point format, then converts it back to plain text.
- Character Inspector - A detailed table showing each character's glyph, code point, UTF-8 bytes, UTF-16 units, Unicode name, and general category.
- Full Emoji Support - Correctly handles multi-code-point emoji, skin tone modifiers, flag sequences, and supplementary-plane characters that need surrogate pairs in UTF-16.
- Real-Time Conversion - Output updates as you type, with no need to click a button. Instant feedback makes it easy to experiment with different characters.
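For readers who want to reproduce the Character Inspector's columns themselves, here is a small sketch using Python's standard unicodedata module. The inspect_text function and its dictionary keys are hypothetical, and the tool's own table may be built differently.

```python
import unicodedata


def inspect_text(text: str) -> list:
    rows = []
    for ch in text:
        utf16 = ch.encode("utf-16-be")
        # Group the big-endian bytes into 16-bit code units.
        units = [utf16[i:i + 2].hex().upper() for i in range(0, len(utf16), 2)]
        rows.append({
            "char": ch,
            "code_point": f"U+{ord(ch):04X}",
            "utf8": ch.encode("utf-8").hex(" ").upper(),
            "utf16": " ".join(units),
            "name": unicodedata.name(ch, "<unnamed>"),
            "category": unicodedata.category(ch),
        })
    return rows


for row in inspect_text("A\U0001F600"):
    print(row["code_point"], row["utf8"], row["utf16"], row["category"], row["name"])
# U+0041 41 0041 Lu LATIN CAPITAL LETTER A
# U+1F600 F0 9F 98 80 D83D DE00 So GRINNING FACE
```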
Frequently Asked Questions
What is the difference between UTF-8 and UTF-16?
UTF-8 and UTF-16 are both encodings of the Unicode standard, but they use different code unit sizes. UTF-8 uses 1 to 4 bytes per code point and is backward-compatible with ASCII, making it the dominant encoding on the web. UTF-16 uses one or two 16-bit code units (2 or 4 bytes) per code point and is used internally by JavaScript, Java, and Windows. Code points outside the Basic Multilingual Plane (above U+FFFF) require surrogate pairs in UTF-16.
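The size difference and the surrogate pair arithmetic are easy to check by hand. The following sketch uses hypothetical helper names. It compares encoded lengths and splits a supplementary code point into its high and low surrogates using the standard UTF-16 formula.

```python
def byte_lengths(ch: str) -> tuple:
    # (UTF-8 byte count, UTF-16 byte count) for a single character.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


def to_surrogate_pair(code_point: int) -> tuple:
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary code points need a surrogate pair")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


print(byte_lengths("A"))            # (1, 2)
print(byte_lengths("\u20AC"))       # (3, 2)  euro sign
print(byte_lengths("\U0001F600"))   # (4, 4)  grinning face
high, low = to_surrogate_pair(0x1F600)
print(f"{high:04X} {low:04X}")      # D83D DE00
```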
Why do some emoji show as multiple code points?
Many modern emoji are composed of multiple Unicode code points joined together. For example, family emoji combine several person characters with Zero Width Joiner (U+200D) characters. Skin tone variants append a modifier code point after the base emoji. The Character Inspector in this tool shows each component separately so you can understand the full sequence.
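As a quick illustration, the sketch below splits a family emoji and a skin tone variant into their component code points using Python's standard unicodedata module. The components helper is a hypothetical name used only for this example.

```python
import unicodedata


def components(sequence: str) -> list:
    # One (code point, official name) pair per code point in the sequence.
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in sequence]


family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"   # man, ZWJ, woman, ZWJ, girl
thumbs_up = "\U0001F44D\U0001F3FD"                       # base emoji + skin tone modifier

for code_point, name in components(family):
    print(code_point, name)
# U+1F468 MAN
# U+200D ZERO WIDTH JOINER
# U+1F469 WOMAN
# U+200D ZERO WIDTH JOINER
# U+1F467 GIRL
```

A renderer that supports the sequence draws it as one glyph, which is why a single visible emoji can contain five code points.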
When should I use HTML entities versus JavaScript escapes?
Use HTML entities (like &#128512; or &#x1F600;) when embedding special characters directly in HTML source code. Use JavaScript escapes (like \u0048 or \u{1F600}) when defining strings in JavaScript source files. CSS escapes (like \0048) are used in stylesheet content properties and selectors. Each format is specific to its context.
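To see why the formats are not interchangeable, the sketch below runs each kind of escape through a parser for the other context, using only Python's standard library. It relies on the fact that JSON strings share the four-digit \uXXXX form with JavaScript (but not the braced \u{...} form), so json.loads stands in for a JavaScript string parser here.

```python
import html
import json

entity_text = "&#128512; &#x1F600;"          # HTML decimal and hex entities
js_literal = r'"\u0048\ud83d\ude00"'         # quoted string with \uXXXX escapes

# Each parser decodes its own format...
html_decoded = html.unescape(entity_text)    # two grinning face emoji
js_decoded = json.loads(js_literal)          # "H" plus one grinning face emoji

# ...and passes the other format through unchanged.
html_on_js = html.unescape(r"\u0048")        # still the six characters \u0048
js_on_html = json.loads('"&#128512;"')       # still the text &#128512;

print(html_decoded == "\U0001F600 \U0001F600")   # True
print(js_decoded == "H\U0001F600")               # True
```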
Can this tool handle right-to-left scripts like Arabic and Hebrew?
Yes. The converter works with any Unicode text regardless of script direction. Right-to-left characters are converted to their code points and escape sequences just like left-to-right characters. The Character Inspector will show the Unicode bidirectional category for each character, helping you understand text directionality.
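If you want to look up directionality outside the tool, Python's standard unicodedata module exposes the Unicode Bidi_Class property. This is a minimal sketch with a hypothetical helper name. Whether the Character Inspector shows the same abbreviations is an assumption.

```python
import unicodedata


def bidi_classes(text: str) -> list:
    # One (code point, Bidi_Class abbreviation) pair per character.
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]


# Latin A, Hebrew alef, Arabic alef, digit one
print(bidi_classes("A\u05D0\u06271"))
# [('U+0041', 'L'), ('U+05D0', 'R'), ('U+0627', 'AL'), ('U+0031', 'EN')]
```

Here L is left-to-right, R is right-to-left, AL is right-to-left Arabic letter, and EN is European number.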