The string you received is typical of a charset mismatch.
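
To illustrate what such a mismatch tends to look like, here is a minimal Python sketch. The sample text is an assumption (not your actual string), and Latin-1 is only one possible wrong charset: text encoded as UTF-8 but decoded as Latin-1 turns each accented character into two garbage characters.

```python
# Hypothetical sample text; not the string from the question.
original = "café"

# Encode as UTF-8, then (wrongly) decode as Latin-1 to simulate the mismatch.
garbled = original.encode("utf-8").decode("latin-1")

# Undoing the wrong decoding step recovers the original text.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)
print(repaired)
```

If your garbled output follows the same pattern, the bytes are most likely valid UTF-8 that is being read with the wrong charset.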

As you can see, your test string is not well-formed HTML. You can solve the issue by prefixing the string with an HTML header that defines the correct charset; in your case it should be UTF-8.
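
A rough sketch of the header approach, written in Python for illustration only. The helper name `with_utf8_header` and the sample fragment are hypothetical, and the exact markup your converter expects may differ:

```python
def with_utf8_header(fragment: str) -> str:
    """Wrap an HTML fragment in a minimal document that declares UTF-8.

    Hypothetical helper for illustration; adapt the markup as needed.
    """
    return (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
        "</head><body>"
        + fragment
        + "</body></html>"
    )


# Hypothetical fragment containing non-ASCII characters.
html = with_utf8_header("<p>Grüße</p>")

# The declaration only helps if the bytes really are UTF-8.
html_bytes = html.encode("utf-8")
```

The declared charset and the actual byte encoding have to agree, so encode the resulting string as UTF-8 before passing it on as bytes.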

Alternatively, you can use the readHTML() API method that takes an encoding parameter and specify “UTF8” there.