RPython ord() With Non-ASCII Character
I'm making a virtual machine in RPython using PyPy. My problem is that I am converting each character into its numerical representation. For example, converting the letter 'a' pro
Solution 1:
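#!/usr/bin/env python
# -*- coding: latin-1 -*-
# With a Latin-1 source file, 'á' is stored as the single byte 0xE1.
char = 'á'
print str(int(ord(char)))
# hex() needs the integer from ord(); int('á') would raise ValueError.
print hex(ord(char))
# Decoding to unicode first yields the same value, since Latin-1 maps bytes 1:1 to code points.
print hex(ord(char.decode('latin-1')))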
Gives me:
225
0xe1
0xe1
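The result above holds only when the source file really is saved as Latin-1. As a minimal sketch (not part of the original answer), the character can be written as an escape sequence so the outcome does not depend on how the file is saved, and then encoded both ways to compare the bytes. The variable names are illustrative, and the snippet is plain Python that has not been checked against the RPython toolchain.

```python
# -*- coding: utf-8 -*-
# U+00E1 (a with acute accent) written as an escape, so the result
# does not depend on the encoding of this source file.
char = u'\u00e1'

# bytearray yields integers on both Python 2 and Python 3.
latin1_bytes = bytearray(char.encode('latin-1'))
utf8_bytes = bytearray(char.encode('utf-8'))

latin1_hex = [hex(b) for b in latin1_bytes]
utf8_hex = [hex(b) for b in utf8_bytes]

print(latin1_hex)  # ['0xe1']
print(utf8_hex)    # ['0xc3', '0xa1']
```

Latin-1 stores the character in one byte whose value equals the code point, which is why ord() returns 225 in Solution 1. UTF-8 needs two bytes for the same character.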
Solution 2:
You are using Python 2, so your string "á" is a byte string, and its contents depend on the encoding of your source file. If the encoding is UTF-8, the bytes are C3 A1: the string contains two bytes.
If you want Unicode code points (aka characters), or UTF-16 code units (depending on your Python installation), convert it to unicode first, for example using .decode('utf-8').
# -*- encoding: utf-8 -*-
def stuff(instr):
    for char in instr:
        char = str(int(ord(char)))
        char = hex(int(char))
        # I'd replace those two lines above with char = hex(ord(char))
        char = char[2:]
        print char

stuff("á")
print("-------")
stuff(u"á")
print("-------")
stuff(u"á")
Outputs:
c3
a1
-------
e1
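The decode-first approach can be wrapped in a small helper. This is a sketch only: hex_codepoints is a hypothetical name, not something from the question or from RPython, and the code is ordinary Python (it runs on both 2 and 3) that has not been verified under the RPython translator. It assumes the input is a byte string in a known encoding.

```python
def hex_codepoints(data, encoding='utf-8'):
    """Decode a byte string and return each code point as a hex string."""
    text = data.decode(encoding)
    # hex() returns e.g. '0xe1'; slice off the '0x' prefix.
    return [hex(ord(ch))[2:] for ch in text]


# The two UTF-8 bytes C3 A1 decode to the single code point U+00E1.
result = hex_codepoints(b'\xc3\xa1')
print(result)  # ['e1']
```

Because the bytes are decoded before ord() is applied, a multi-byte UTF-8 sequence produces one value per character instead of one per byte.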