Comment 2 for bug 196174

John S (jcspray) wrote :

The empty <MedlinePgn/> element was confusing the pubmed plugin's parsing code. It looks like this is a pre-print article and so has no page number, but for some reason the tag is included anyway.

Regardless, I have committed a fix to the parsing code to tolerate this situation.

Index: plugins/pubmed.py
===================================================================
--- plugins/pubmed.py (revision 705)
+++ plugins/pubmed.py (revision 706)
@@ -73,7 +73,10 @@
        if len(value) == 0:
                return ""
        else:
-               return value[0].childNodes[0].data.encode("utf-8")
+               if (len(value[0].childNodes) == 0):
+                       return ""
+               else:
+                       return value[0].childNodes[0].data.encode("utf-8")

 def text_output(xml):
@@ -131,7 +134,7 @@
        pages = get_field (xmldoc, "MedlinePgn")
        output.append (["pages", pages])

-       output2 = [];
+       output2 = []
        for pair in output:
                if len(pair[1]) > 0:
                        output2.append(pair)
@@ -145,12 +148,14 @@
                elif (method == "pubmed"):
                        xml = get_citation_from_pmid (doc.get_field ("pmid"))
        except:
+               print "pubmed.py:resolve_metadata: Got no metadata"
                # Couldn't get any metadata
                return False

        try:
                items = text_output (xml)
        except:
+               print "pubmed.py:resolve_metadata: Couldn't parse metadata"
                # Couldn't parse XML
                return False
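
For anyone who wants to try the guard from the first hunk in isolation, here is a minimal sketch. It is not the plugin's code. It assumes Python 3 and xml.dom.minidom, the sample XML is made up, and this get_field is a standalone re-creation of the patched logic without the encode("utf-8") call.

```python
from xml.dom import minidom


def get_field(xmldoc, tag_name):
    """Return the text of the first <tag_name> element, or "" if it is absent or empty."""
    elements = xmldoc.getElementsByTagName(tag_name)
    if len(elements) == 0:
        # Tag is not present at all.
        return ""
    children = elements[0].childNodes
    if len(children) == 0:
        # Self-closing tag such as <MedlinePgn/>: there is no text node to read.
        return ""
    # Like the patch, this assumes the first child is a text node.
    return children[0].data


# Hypothetical record: a title plus an empty page-number tag.
SAMPLE = "<Article><ArticleTitle>Example</ArticleTitle><MedlinePgn/></Article>"
doc = minidom.parseString(SAMPLE)

print(repr(get_field(doc, "ArticleTitle")))
print(repr(get_field(doc, "MedlinePgn")))
```

Without the childNodes length check, the empty tag raises an IndexError on childNodes[0], which is what the bare except in resolve_metadata was silently swallowing.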