core/homeassistant/components/scrape
David Beitey 7a73c6adf7
scrape: extract strings from new non-text tags (#35021)
With the upgrade of beautifulsoup4 to 4.9.0 (#34007), certain tags
(`<style>`, `<script>` and `<template>`) are no longer treated as having
text content (see
https://www.crummy.com/software/BeautifulSoup/bs4/doc/#comments-and-other-special-strings
and the reported bug https://bugs.launchpad.net/beautifulsoup/+bug/1868861),
meaning the content of these tags became inaccessible to HA.

Where the previous code could access `.text` on the tag, bs4 4.9 now
yields an empty string; these tags require accessing `.string`
instead.  This PR checks the tag name (which will always be lowercase
given how the parser works;
https://www.crummy.com/software/BeautifulSoup/bs4/doc/#other-parser-problems)
and reads `.string` instead of `.text` to get the content of these
HTML tags.  All other tags are handled in the original manner.
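
A minimal sketch of the access strategy described above, not the actual
diff: `extract_value` and `NON_TEXT_TAGS` are hypothetical names, and the
`SimpleNamespace` objects only stand in for bs4 tags (mimicking the
empty `.text` that bs4 4.9 yields for these tags) so the example runs
without bs4 installed.

```python
from types import SimpleNamespace

# Tags whose content bs4 4.9+ no longer exposes through `.text`.
# The parser lowercases tag names, so a plain membership test suffices.
NON_TEXT_TAGS = ("style", "script", "template")


def extract_value(tag):
    """Return the content of a bs4-like tag (hypothetical helper)."""
    if tag.name in NON_TEXT_TAGS:
        # `.string` still holds the content of these tags.
        return tag.string
    # All other tags are read the original way.
    return tag.text


# Stand-ins for bs4 tags; real code would get these from a parsed page.
script_tag = SimpleNamespace(name="script", text="", string="var x = 1;")
div_tag = SimpleNamespace(name="div", text="hello", string="hello")

script_value = extract_value(script_tag)
div_value = extract_value(div_tag)
```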
2020-05-04 10:45:40 +02:00
..
__init__.py
manifest.json
sensor.py scrape: extract strings from new non-text tags (#35021) 2020-05-04 10:45:40 +02:00