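Converting an HTML-escaped string into an XML-escaped one takes just 3 lines of code in both Python 2 and Python 3: unescape the HTML entities first, then escape the result for XML.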

Python 2

from HTMLParser import HTMLParser  
from xml.sax.saxutils import escape  

xml_str = escape(HTMLParser().unescape(html_str))  

Python 3

Python 3.3 or older

from html.parser import HTMLParser  
from xml.sax.saxutils import escape  

xml_str = escape(HTMLParser().unescape(html_str))  

Python 3.4+

import html  
from xml.sax.saxutils import escape  

xml_str = escape(html.unescape(html_str))  
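
The three variants above can be folded into one helper that picks the right import at runtime. This is only a sketch: the function name `html_to_xml_text` and the sample string are made up for illustration and are not part of any library.

```python
# Pick whichever unescape implementation this interpreter provides.
try:
    from html import unescape  # Python 3.4+
except ImportError:
    try:
        from html.parser import HTMLParser  # Python 3.0 - 3.3
    except ImportError:
        from HTMLParser import HTMLParser  # Python 2
    unescape = HTMLParser().unescape

from xml.sax.saxutils import escape


def html_to_xml_text(html_str):
    """Hypothetical helper: decode HTML entities, then re-escape for XML."""
    # unescape() turns entities such as &eacute; or &lt; into real characters;
    # escape() then encodes only &, < and >, which XML text requires.
    return escape(unescape(html_str))


sample = "Caf&eacute; &amp; bar &lt;b&gt;"
xml_str = html_to_xml_text(sample)
print(xml_str)
```

HTML-only named entities such as `&eacute;` end up as literal characters, while `&`, `<` and `>` stay escaped. Note that `escape` leaves quotes untouched by default, so if the result goes into an attribute value, `xml.sax.saxutils.quoteattr` may be a better fit.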
