EllyTools

Unicode Text Fixer

Fix broken Korean, Arabic, Thai, Hindi text — garbled characters & mojibake

File Name Fixer

Drag and drop files, or click to select them, to fix their names.

Why does this happen?

Different systems produce different Unicode normalization forms. macOS has traditionally stored file names in a decomposed form (a variant of NFD), while Windows and most web services expect the composed form, NFC. When decomposed text reaches software that does not recompose it, the text (especially Korean, Japanese dakuten, Arabic, and Thai) can appear broken or garbled.

  • macOS NFD vs Windows NFC: Korean file names appear as separated consonants and vowels
  • iOS file transfers and email attachments often carry NFD-normalized names
  • Copy-pasting text between different applications can introduce encoding or normalization mismatches
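The file-name case can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how EllyTools works internally: `plan_renames` is a hypothetical helper that takes a list of names and reports which ones would change under NFC normalization.

```python
import unicodedata


def nfc_name(name: str) -> str:
    """Return the NFC (composed) form of a file name."""
    return unicodedata.normalize("NFC", name)


def plan_renames(names: list[str]) -> list[tuple[str, str]]:
    """List (old, new) pairs for names that are not already in NFC.

    Hypothetical helper for illustration; it only plans the renames
    and does not touch the file system.
    """
    plan = []
    for old in names:
        new = nfc_name(old)
        if new != old:
            plan.append((old, new))
    return plan


# A name as it might arrive from a macOS machine: decomposed jamo plus ".txt"
decomposed = unicodedata.normalize("NFD", "\ud55c\uae00.txt")
print(plan_renames([decomposed, "report.txt"]))
```

To apply the plan to real files, each pair could be passed to `os.rename`, after checking that the target name does not already exist. Note that some file systems treat the NFC and NFD spellings as the same file, so such a check behaves differently across platforms.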

Example

Broken (NFD)

ㅎㅏㄴㄱㅡㄹ (six separate jamo code points)

Fixed (NFC)

한글 (two precomposed syllables)
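The difference between the two forms is easier to see at the code point level. The snippet below is a minimal sketch using Python's standard `unicodedata` module; `code_points` is a hypothetical helper introduced only for this example.

```python
import unicodedata

composed = "\ud55c\uae00"  # the word from the example above, as two precomposed syllables (NFC)
decomposed = unicodedata.normalize("NFD", composed)


def code_points(text: str) -> list[str]:
    """Format each character of `text` as a U+XXXX code point label."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(composed))    # ['U+D55C', 'U+AE00']
print(code_points(decomposed))  # ['U+1112', 'U+1161', 'U+11AB', 'U+1100', 'U+1173', 'U+11AF']

# Normalizing back to NFC recombines the jamo into syllables.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Both strings represent the same text, but a plain string comparison treats them as different, which is why file names and search queries can fail to match.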

How to Use

  1. Paste broken or garbled text into the input box
  2. Select a language mode or use Auto Detect
  3. Click Fix to normalize the Unicode text
  4. Copy the fixed text or download as a file

Who Is This For?

  • Users who receive garbled Korean file names from macOS
  • Developers dealing with Unicode normalization issues
  • Anyone who gets mojibake text in emails or file transfers
  • International teams working with multilingual text data

Why Choose EllyTools?

100% Free & Unlimited

No sign-up, no limits. Use as many times as you want.

Privacy First

All processing happens in your browser. Your files never leave your device.

No Installation Required

Works directly in your browser on any device — desktop, tablet, or phone.

Fast & Reliable

Instant results powered by modern browser technology.

Unicode Text Fixer: Cleaning Up Broken or Mixed-Encoding Text

Sometimes text comes through with weird artifacts: 'â€™' instead of an apostrophe, '?' or empty boxes where Korean characters should be, mojibake that looks like random symbols. This tool tries to detect and repair common encoding mistakes.

Mojibake (文字化け) is what happens when text is decoded with the wrong character set. UTF-8 bytes interpreted as Windows-1252 produce the familiar 'â€™' for a curly apostrophe, 'â€”' for an em dash, and 'â€¦' for an ellipsis. The reverse mistake, Windows-1252 bytes read as UTF-8, usually yields replacement characters (�) instead. The tool reverses these common mistakes.
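The UTF-8 read as Windows-1252 case can be reversed by re-encoding the garbled text as Windows-1252 and decoding the resulting bytes as UTF-8. The following is a minimal sketch of that idea, not the tool's actual implementation; `fix_mojibake` is a hypothetical name.

```python
def fix_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes decoded as Windows-1252'.

    If the text cannot be round-tripped (it was probably not mojibake
    of this kind), it is returned unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


print(fix_mojibake("don\u00e2\u20ac\u2122t"))  # curly apostrophe restored
print(fix_mojibake("wait\u00e2\u20ac\u00a6"))  # ellipsis restored
print(fix_mojibake("caf\u00e9"))               # clean text is left alone
```

Returning the input on failure is a deliberately conservative choice: correctly encoded text rarely survives the round trip, so the error acts as a signal that nothing needs fixing.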

Other common cleanup steps: stripping invisible Unicode characters (such as zero-width spaces, sometimes inserted by phishing messages or templating systems), normalizing accented characters, and removing private-use-area glyphs.
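A rough sketch of the invisible-character and private-use cleanup is shown below. The set of characters to strip is an assumption chosen for this example; the tool's real list is not documented here. Zero-width joiners (U+200C, U+200D) are deliberately kept, because Persian, Indic scripts, and emoji sequences rely on them.

```python
import unicodedata

# Assumed list for illustration: zero-width space, word joiner, BOM / zero-width no-break space
INVISIBLE = {"\u200b", "\u2060", "\ufeff"}


def strip_invisible(text: str) -> str:
    """Remove selected invisible characters and private-use code points."""
    kept = []
    for ch in text:
        if ch in INVISIBLE:
            continue
        if unicodedata.category(ch) == "Co":  # Co = private use
            continue
        kept.append(ch)
    return "".join(kept)


print(strip_invisible("pay\u200bpal\ue000"))  # paypal
```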

Common fixes

  • â€™ → ’ (UTF-8 misread as Windows-1252)
  • Â followed by a non-breaking space → a single non-breaking space (stray Â removed)
  • Zero-width space (U+200B) and other invisible characters removed
  • Smart quotes back to straight quotes (or vice versa)
  • Composed vs decomposed accents (NFC vs NFD normalization)
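The smart-quote item in the list above is a simple character substitution. As a hedged sketch (the mapping table and the `straighten_quotes` name are assumptions for this example), it could look like this:

```python
# Map curly quotation marks to their straight ASCII equivalents.
SMART_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / apostrophe
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight quotes."""
    return text.translate(SMART_TO_STRAIGHT)


print(straighten_quotes("\u201cdon\u2019t\u201d"))  # "don't"
```

Going the other way (straight to curly) needs context to choose between opening and closing marks, so it cannot be done with a plain lookup table.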

Extended FAQ

Can it fix all mojibake?

Common patterns, yes. UTF-8 ↔ Windows-1252 mistakes are usually reversible. Some pathological cases lose information and cannot be recovered, for example text that was decoded and re-encoded several times, or text in which bytes were replaced with '?' or � along the way.
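Both situations can be illustrated with a short sketch. It assumes the same UTF-8 and Windows-1252 mix-up, applied more than once, and uses Python's `cp1252` codec, which leaves a few byte values undefined; `fix_repeated` is a hypothetical helper, not the tool's code.

```python
def fix_repeated(text: str, max_rounds: int = 3) -> str:
    """Undo up to `max_rounds` layers of UTF-8-read-as-Windows-1252 mojibake."""
    for _ in range(max_rounds):
        try:
            candidate = text.encode("cp1252").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # no further layer can be peeled off
        if candidate == text:
            break
        text = candidate
    return text


# Recoverable: the same mistake applied twice.
once = "caf\u00e9".encode("utf-8").decode("cp1252")
twice = once.encode("utf-8").decode("cp1252")
print(fix_repeated(twice) == "caf\u00e9")  # True

# Not recoverable: a byte with no Windows-1252 mapping was replaced by U+FFFD.
lossy = "\u201d".encode("utf-8").decode("cp1252", errors="replace")
print(fix_repeated(lossy) == lossy)  # True, the original byte is gone
```

Once a byte has been swapped for a replacement character, no amount of re-decoding can tell which byte it used to be.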

Are my pasted strings stored?

No. The tool runs entirely in your browser.