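Converting HTML-escaped text into XML-safe text takes just 3 lines of code in both Python 2 and Python 3: unescape the HTML entities first, then escape the result for XML.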
Python 2
from HTMLParser import HTMLParser
from xml.sax.saxutils import escape
xml_str = escape(HTMLParser().unescape(html_str))
Python 3
Python 3.3 or older
from html.parser import HTMLParser
from xml.sax.saxutils import escape
# HTMLParser.unescape() was deprecated in Python 3.4 and removed in 3.9
xml_str = escape(HTMLParser().unescape(html_str))
Python 3.4+
import html
from xml.sax.saxutils import escape
xml_str = escape(html.unescape(html_str))
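To see what the two steps actually do, here is a minimal sketch for Python 3.4+ that wraps them in a helper and runs it on a sample string. The function name `html_to_xml_text` and the sample input are made up for illustration; only `html.unescape` and `xml.sax.saxutils.escape` come from the snippets above.

```python
import html
from xml.sax.saxutils import escape


def html_to_xml_text(html_str):
    """Hypothetical helper: decode HTML entities, then re-escape for XML."""
    # html.unescape turns named and numeric references (e.g. &copy;, &#169;)
    # into the characters they stand for.
    decoded = html.unescape(html_str)
    # escape() re-encodes only &, < and >, which is what XML text needs.
    return escape(decoded)


# Illustrative input, not from the original post.
sample = "Tom &amp; Jerry &copy; 2020 &lt;b&gt;"
print(html_to_xml_text(sample))  # Tom &amp; Jerry © 2020 &lt;b&gt;
```

Note that `&copy;` becomes a literal `©` because XML does not define that named entity, while `&amp;` and `&lt;` survive the round trip. `escape()` leaves quotes alone by default, so this sketch is meant for element text rather than attribute values.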