[url] Don't try and build an etree out of non-html
Bug #454768 reported by Stefano Rivera
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ibid | Fix Released | High | Stefano Rivera |
Bug Description
Some digging through memory logs shows huge spikes of "Element" objects causing memory to be allocated.
These correlate strongly to the presence of .jpg URLs in the logs.
Probable culprit:
def _get_title(self, url):
    "Gets the title of a page"
    try:
        headers = {'User-Agent': 'Mozilla/5.0'}
        etree = get_html_
        title = etree.findtext(
        return title
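The idea in the bug title can be sketched roughly as follows: look at the response's Content-Type before building a tree, and skip parsing for anything that is not HTML (such as the .jpg URLs mentioned above). This is only an illustration, not the code from the linked branch. The names `looks_like_html`, `get_title`, and `HTML_TYPES` are hypothetical, and the standard library's `xml.etree.ElementTree` stands in for whatever parser the plugin actually uses.

```python
import xml.etree.ElementTree as ET

# Media types treated as HTML in this sketch (an assumption, not Ibid's list).
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def looks_like_html(content_type):
    """Return True if a Content-Type header value names an HTML media type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in HTML_TYPES


def get_title(content_type, body, parse=ET.fromstring):
    """Return the page title, or None without parsing if the body is not HTML.

    `parse` is any callable that turns the body into an object with an
    ElementTree-style findtext() method.
    """
    if not looks_like_html(content_type):
        # Never build a tree (and its Element objects) for images and the like.
        return None
    tree = parse(body)
    return tree.findtext("head/title")


if __name__ == "__main__":
    page = "<html><head><title>Example</title></head><body/></html>"
    print(get_title("text/html; charset=utf-8", page))
    print(get_title("image/jpeg", b"\xff\xd8\xff"))
```

Checking the header first means the parser is never invoked on binary data, so no Element objects are allocated for it.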
Related branches
lp:~stefanor/ibid/url-memory-454768
- Michael Gorven: Approve
- Jonathan Hitchcock: Approve
Diff: 66 lines, 2 files modified
- ibid/plugins/url.py (+9/-3)
- ibid/utils.py (+8/-0)
Changed in ibid:
importance: Undecided → High
Changed in ibid:
status: In Progress → Fix Released