Difference Between Python 2 and 3 for UTF-8

December 27, 2023

Why is the output different for the two commands below?

$ python2.7 -c 'print("\303\251")'
é # <-- Great
$ python3.6 -c 'print("\303\251")'
Ã© # <-- WTF?!

What woul

Solution 1:

On Python 2, the literal "\303\251" is a byte string, so you are telling Python to print two bytes (0xC3 and 0xA9). It writes those two bytes unchanged. Your terminal interprets them as the UTF-8 encoding of é and displays é. (It looks like your terminal is using UTF-8.)

On Python 3, the same literal is a Unicode string, so you are telling Python to print the two characters with code points 0o303 and 0o251 (octal), that is, U+00C3 and U+00A9. Those characters are Ã and ©. Python encodes them in a system-dependent encoding (probably UTF-8), which yields four bytes rather than two, and writes those bytes to stdout. Your terminal then decodes the bytes and displays Ã©.
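To make the difference concrete, here is a minimal Python 3 sketch that writes the same escape sequence once as text and once as bytes. The variable names are made up for illustration, and the encoding is passed explicitly as UTF-8 rather than relying on the system default.

```python
# The same escape sequence as a text string and as a byte string.
text = "\303\251"    # str: two code points, U+00C3 and U+00A9
raw = b"\303\251"    # bytes: two bytes, 0xC3 and 0xA9

# Code points of the two characters in the text string.
code_points = [hex(ord(ch)) for ch in text]

# Encoding the text string gives four bytes: two per character.
encoded = text.encode("utf-8")

# Decoding the byte string as UTF-8 gives the single character U+00E9.
decoded = raw.decode("utf-8")

print(code_points)            # ['0xc3', '0xa9']
print(encoded)                # b'\xc3\x83\xc2\xa9'
print(decoded == "\u00e9")    # True
```

The four bytes in `encoded` are what Python 3 sends to a UTF-8 terminal for the original command, which is why two characters appear instead of one.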

If you want Python 3 to print é, give it the escape for the character's Unicode code point (\u00e9), or just put é in the string directly:

$ python3.6 -c 'print("é")'
é
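As a small sketch, these are several ways of spelling the same one-character string in a Python 3 literal. They are all standard escape forms; the names `e_acute` and `spellings` are only for illustration.

```python
import unicodedata

e_acute = "\u00e9"   # 4-digit Unicode escape

# Other spellings of the same single code point, U+00E9.
spellings = [
    "\xe9",                                   # 2-digit hex escape
    "\351",                                   # octal escape (0o351 == 0xE9)
    "\N{LATIN SMALL LETTER E WITH ACUTE}",    # escape by Unicode name
    chr(0xE9),                                # built from the integer code point
]

all_equal = all(s == e_acute for s in spellings)

print(all_equal)                    # True
print(unicodedata.name(e_acute))    # LATIN SMALL LETTER E WITH ACUTE
print(e_acute.encode("utf-8"))      # b'\xc3\xa9'
```

Note that the octal escape for é in a text string is \351, not \303\251; the latter pair is the UTF-8 byte encoding of the character, not its code point.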

Solution 2:

As explained in the first answer by user2357112, this line tells Python 3 to print two characters given by octal escapes (in a Python 3 string literal, an octal escape denotes the Unicode code point of a character, not a byte):

$ python3.6 -c 'print("\303\251")'
Ã©

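The following line gives the same output as Python 2 by treating the escapes as bytes and decoding them (bytes.decode() defaults to UTF-8 in Python 3) before printing: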

$ python3.6 -c 'print(b"\303\251".decode())'
é
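If the goal is to reproduce Python 2 more literally, one option is to skip text altogether and write the raw bytes to a binary stream. The sketch below assumes a hypothetical helper, `write_raw`, and uses an in-memory stream so the result can be inspected. It also shows how an already garbled string can be repaired, assuming the garbling came from UTF-8 bytes being read as Latin-1.

```python
import io

def write_raw(stream, data):
    """Write bytes to a binary stream with no encoding step (hypothetical helper)."""
    stream.write(data)
    stream.flush()

# An in-memory binary stream stands in for the terminal here.
sink = io.BytesIO()
write_raw(sink, b"\303\251\n")
written = sink.getvalue()
print(written)                 # b'\xc3\xa9\n'

# Repairing the garbled text: turn each character back into the byte
# with the same value (Latin-1), then decode those bytes as UTF-8.
garbled = "\u00c3\u00a9"       # the two characters printed by Python 3
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == "\u00e9")    # True
```

On a real terminal, passing `sys.stdout.buffer` as the stream sends the two bytes through unchanged, as Python 2 did. The repair step only works when the assumption about how the text was garbled actually holds.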
