Safely Extracting Partial Data From Pickled Objects
Solution 1:
An idea might be to read the pickled objects from the files as strings, then use pickletools.dis to see what's in them, only allowing a specific list of opcodes ('STOP', 'INT', …) to appear in the opcode-name column. That would rule out the pickle containing any of the types of objects that you are worried about, and if you are only targeting a very specific list of basic Python objects, you might be able to do this safely.
Here's what you get with pickletools.dis:
>>> import pickletools
>>> import pickle
>>>
>>> p1 = pickle.dumps(1)
>>> p2 = pickle.dumps(min)
>>>
>>> pickletools.dis(p1)
0: I INT 1
3: . STOP
highest protocol among opcodes = 0
>>> pickletools.dis(p2)
0: c GLOBAL '__builtin__ min'
17: p PUT 0
20: . STOP
highest protocol among opcodes = 0
>>>
It's better than writing a full pickle parser, and possibly doable if you only want to allow simple objects like INTs.
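The opcode check described above can be automated instead of being read off the pickletools.dis output by eye. Below is a minimal sketch, assuming Python 3 and using pickletools.genops, which walks the opcodes without executing them. The function name and the allow-list are hypothetical choices made for illustration, and the list would need extending (with care) for anything beyond plain integers.

```python
import pickle
import pickletools

# Hypothetical allow-list: extend it only with opcodes you have reviewed.
ALLOWED_OPCODES = {"INT", "STOP"}

def uses_only_allowed_opcodes(data, allowed=ALLOWED_OPCODES):
    """Return True if every opcode in the pickle byte string is allow-listed."""
    try:
        # genops parses the opcode stream; it never builds or calls objects.
        return all(op.name in allowed for op, arg, pos in pickletools.genops(data))
    except Exception:
        # Malformed or truncated data is treated as unsafe.
        return False

int_pickle = b"I1\n."              # hand-written protocol-0 pickle of the integer 1
global_pickle = pickle.dumps(min)  # refers to a global, so it needs other opcodes

print(uses_only_allowed_opcodes(int_pickle))
print(uses_only_allowed_opcodes(global_pickle))
```

Only data that passes a check like this would then be handed to pickle.loads. Anything that fails is rejected without being loaded.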
Solution 2:
You can do this, but only if you parse the data yourself rather than relying on pickle, which could lead to arbitrary code execution. A very simple example of doing this could be:
import pickle
import re

class Test(object):
    def __init__(self, l):
        self.internal_list = l
        self.foo = 2
        self.bar = 24

# Create a pickled version of an object
t = Test([1,2,3,4,5,6,7,8,9,10])
pickle.dump(t, open("test.pickle",'w'))

def find_last_integer(s):
    """ Parses a string to return the integer that it ends with
    e.g. find_last_integer("foobar312") == 312
    """
    return int(re.search(r"\d+$", s).group())

# Load the pickled data
data = open("test.pickle").read()
# Assumes that the class will only contain one list;
# if you need more then look for all lines starting "(lp"
listdata = data[data.find("(lp"):].split('\n')
# The number after "(lp" is the pickle memo index, not the list length,
# so read elements until a line stops looking like one.
# Each element of the list should be of the form "In" or "aIn"
reconstructed = []
for elem in listdata[1:]:
    if not re.match(r"a?I\d+$", elem):
        break
    reconstructed.append(find_last_integer(elem))
print reconstructed
Note that I've only tested this in Python 2.7.8; YMMV if you use it with other versions.
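The text-based parsing above depends on the protocol-0 layout. As a rough sketch of the same "parse it yourself" idea for Python 3, the opcode stream can be walked with pickletools.genops, which decodes arguments without executing anything. The function name, the opcode groupings, and the sample payload are assumptions made for illustration. The sketch also assumes the layout that Python's own pickler emits (list opcode first, then the elements), so a hand-crafted pickle could defeat it.

```python
import pickle
import pickletools

# Opcodes whose decoded argument is an integer.
INT_OPCODES = {"INT", "BININT", "BININT1", "BININT2", "LONG", "LONG1", "LONG4"}
# Bookkeeping opcodes that may appear between the elements of a list.
SKIPPED_OPCODES = {"MARK", "PUT", "BINPUT", "LONG_BINPUT", "MEMOIZE",
                   "APPEND", "APPENDS"}

def extract_first_int_list(data):
    """Collect the integers of the first list in a pickle without loading it."""
    values = []
    in_list = False
    for opcode, arg, pos in pickletools.genops(data):
        if not in_list:
            # Skip everything until the first list is opened.
            in_list = opcode.name in ("LIST", "EMPTY_LIST")
        elif opcode.name in INT_OPCODES:
            values.append(arg)
        elif opcode.name not in SKIPPED_OPCODES:
            # Anything else means the run of integers has ended.
            break
    return values

# Hypothetical payload mirroring the attributes of the Test class above.
payload = {"internal_list": list(range(1, 11)), "foo": 2, "bar": 24}
print(extract_first_int_list(pickle.dumps(payload)))
```

Because it keys on opcode names rather than on text lines, the same function handles both the text protocol and the binary ones, at the cost of trusting pickletools to decode the stream.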