Difference Between Python 2 And 3 For Utf-8

Why is the output different for the two commands below?

$ python2.7 -c 'print("\303\251")'
é # <-- Great
$ python3.6 -c 'print("\303\251")'
Ã© # <-- WTF?!

What woul

Solution 1:

On Python 2, you are telling Python to print two bytes, and it prints exactly those two bytes. Your terminal interprets them as the UTF-8 encoding of é and displays é. (It looks like your terminal is using UTF-8.)

On Python 3, you are telling Python to print the two characters with Unicode code points 0o303 and 0o251 (in octal), that is, U+00C3 and U+00A9. Those characters are Ã©. Python encodes those characters in a system-dependent encoding (probably UTF-8) and writes the resulting bytes to stdout. Your terminal then decodes the bytes and displays Ã©.
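To make the Python 3 behavior concrete, here is a minimal sketch (assuming the output encoding is UTF-8, which is system-dependent as noted above) that inspects the string and the bytes it is encoded to:

```python
# Python 3: octal escapes in a str literal name code points, not bytes.
s = "\303\251"

print(len(s))                     # 2 (two characters, not two bytes)
print([hex(ord(c)) for c in s])   # ['0xc3', '0xa9'] -> U+00C3 and U+00A9

# Assumption: stdout uses UTF-8. Each character becomes two bytes.
encoded = s.encode("utf-8")
print(encoded)                    # b'\xc3\x83\xc2\xa9'
```

Four bytes reach the terminal instead of two, which is why it shows two characters rather than one.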

If you want Python 3 to print é, give it the Unicode code point (\u00e9), or just tell it to print é:

$ python3.6 -c 'print("é")'
é
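As a quick check of the code point route, the following sketch uses the standard `unicodedata` module to confirm that `\u00e9` is the single character é and that its UTF-8 encoding is the same two bytes Python 2 was printing:

```python
import unicodedata

e_acute = "\u00e9"  # one character, given by its Unicode code point

print(len(e_acute))               # 1
print(unicodedata.name(e_acute))  # LATIN SMALL LETTER E WITH ACUTE
print(e_acute.encode("utf-8"))    # b'\xc3\xa9', i.e. octal \303\251
```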

Solution 2:

As explained in the first answer by user2357112, this line tells Python 3 to print two characters given by their octal values (in a str literal, each octal escape is the Unicode code point of one character, not a byte):

$ python3.6 -c 'print("\303\251")'
Ã©

The following line can be used for a behavior similar to Python 2:

$ python3.6 -c 'print(b"\303\251".decode())'
é
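A small sketch of why this works, assuming the bytes are UTF-8 (in Python 3, `bytes.decode()` defaults to UTF-8, so the codec is spelled out here only for clarity). The last part is an illustrative extra, not from the answers above: a string that was already built from octal escapes can be repaired by round-tripping it through Latin-1, which maps code points 0 to 255 back to the same byte values.

```python
raw = b"\303\251"             # a bytes literal: two bytes, 0xC3 0xA9
text = raw.decode("utf-8")    # interpret the two bytes as one UTF-8 character

print(len(raw), len(text))    # 2 1
print(text == "\u00e9")       # True

# Illustrative repair of the mis-built str "\303\251" (U+00C3 U+00A9):
mojibake = "\303\251"
repaired = mojibake.encode("latin-1").decode("utf-8")
print(repaired == text)       # True
```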
