...
Another way of profiling pyFF's memory usage is simply to watch the RES column in top or htop for a long-running pyFF/gunicorn process that has a 60s refresh interval. I normally use this pipeline
...
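Alongside eyeballing top, it can help to log the same RES figure over time. A minimal sketch of how that could be done (my own addition, not part of pyFF, and Linux-only since it reads /proc) that samples a gunicorn process's VmRSS every 60 seconds:

```python
import sys
import time

def res_kb(pid):
    """Return the resident set size (VmRSS) of a process in kB,
    the same figure top/htop shows as RES."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])

if __name__ == "__main__":
    pid = int(sys.argv[1])  # PID of the long-running pyFF/gunicorn process
    while True:
        print(time.strftime("%H:%M:%S"), res_kb(pid), "kB", flush=True)
        time.sleep(60)  # matches the 60s refresh interval
```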
```python
from lxml import etree, objectify
import pickle

# Create the pickled datafile
source = open("edugain.xml", "r", encoding="utf-8")
sink = open("edugain.pkl", "w")
t = objectify.parse(source)
p = pickle.dumps(t).decode('latin1')
sink.write(p)

# Read the pickled object back in pyFF
def parse_xml(io):
    return pickle.loads(io.encode('latin1'))

# In the metadata parser:
t = parse_xml(content)  # instead of parse_xml(unicode_stream(content))
```
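Note that the `decode('latin1')`/`encode('latin1')` round-trip only exists to push the binary pickle through pyFF's text-oriented content handling; opening the files in binary mode (`'wb'`/`'rb'`) and keeping the pickle as bytes throughout would be the more conventional approach.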
Using un/pickling, pyFF's gunicorn starts out at ~800 MB RES and slowly grows to a steady 1.2-1.5 GB.
...
Using the xml.sax parser, pyFF's gunicorn starts out at ~800 MB RES and slowly grows to a steady 1.2-1.5 GB.
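For reference, a minimal sketch of the kind of SAX-based streaming meant here (the handler and element matching are illustrative, not pyFF's actual parser): events are processed as they arrive, so no full lxml tree is held in memory.

```python
import xml.sax

class EntityCounter(xml.sax.ContentHandler):
    """Count EntityDescriptor elements while streaming, without building a tree."""
    def __init__(self):
        super().__init__()
        self.entities = 0

    def startElement(self, name, attrs):
        # SAML metadata elements usually carry a prefix, e.g. md:EntityDescriptor
        if name.endswith("EntityDescriptor"):
            self.entities += 1

handler = EntityCounter()
xml.sax.parse("edugain.xml", handler)
print(handler.entities, "entities")
```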
...