wg-backend-django/acer-env/lib/python3.10/site-packages/bs4/__pycache__/dammit.cpython-310.pyc

2022-11-30 15:58:16 +07:00
o
<00>ԄcƠ<00>@sXdZdZddlmZddlmZddlZddlZddlZddl Z dZ
zddl Z
Wn+e ySzddl Z
Wne yPzddlZ
Wn e yMdZ
YnwYnwYnwe
r[dd<07>Zndd<07>Zd Zd
Ze<12>Ze<07>e<11>d <0B>ej<16>e<07>e<10>d <0B>ej<16>d <0C>ee<e<07>eej<16>e<07>eej<16>d <0C>ee<dd lmZGdd<0F>de<1A>ZGdd<11>d<11>ZGdd<13>d<13>ZdS)aBBeautiful Soup bonus library: Unicode, Dammit
This library converts a bytestream to Unicode through any means
necessary. It is heavily based on code from Mark Pilgrim's Universal
Feed Parser. It works best on XML and HTML, but it does not rewrite the
XML or HTML to reflect a new encoding; that's the tree builder's job.
<EFBFBD>MIT<49>)<01>codepoint2name)<01> defaultdictNcCst|t<01>rdSt<02>|<00>dS)N<>encoding)<04>
isinstance<EFBFBD>str<74>chardet_module<6C>detect<63><01>s<>r <00>]/home/infidel/Sync/Project/ocp-wg-backend/acer-env/lib/python3.10/site-packages/bs4/dammit.py<70>chardet_dammit+s
rcCsdS<00>Nr r
r r r r0sz$^\s*<\?.*encoding=['"](.*?)['"].*\?>z0<\s*meta[^>]+charset\s*=\s*["']?([^>]*?)[ /;'">]<5D>ascii)<02>html<6D>xml)<01>html5c@s<>eZdZdZdd<03>Ze<04>\ZZZdddddd <09>Ze <09>
d
<EFBFBD>Z e <09>
d <0B>Z e d d <0A><00>Ze dd<0F><00>Ze dd<11><00>Ze ddd<14><01>Ze ddd<16><01>Ze dd<18><00>ZdS)<1B>EntitySubstitutionzFThe ability to substitute XML or HTML entities for certain characters.cCs<>i}i}t<00>}tt<00>}tt<03><04><00>D]T\}}|<04>d<01>r!|dd<03>}n|}||vr+|||<|||<t|<05>dkr@t|<05>dkr@|dvr@qt|<05>dkrPtdd<08>|D<00><01>rPqt|<05>dkr\|<02> |<05>q||d <00> |<05>qt<00>}|D]!}||} | sy|<07> |<08>qkd
<EFBFBD>
d d <0C>| D<00><01>}
|<07> d ||
f<00>qkt |<03> <0C><00>D] } | D]} |<07> | <0C>q<>q<EFBFBD>dd<0F>
|<07>} t t <0A><04><00>D] \}}t|<0E>}|||<q<>||t<0F>| <0A>fS)u<>Initialize variables used by this class to manage the plethora of
HTML5 named entities.
This function returns a 3-tuple containing two dictionaries
and a regular expression:
unicode_to_name - A mapping of Unicode strings like "⦨" to
entity names like "angmsdaa". When a single Unicode string has
multiple entity names, we try to choose the most commonly-used
name.
name_to_unicode: A mapping of entity names like "angmsdaa" to
Unicode strings like "⦨".
named_entity_re: A regular expression matching (almost) any
Unicode string that corresponds to an HTML5 named entity.
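The 3-tuple described above can be approximated with the standard library alone. This is a minimal sketch, not the bytecode's actual logic: the function name `build_entity_tables` is hypothetical, the "most commonly-used name" heuristic is replaced by "first name in sorted order", and all-ASCII strings are simply left out of the regular expression.

```python
import re
from html.entities import html5


def build_entity_tables():
    """Return (unicode_to_name, name_to_unicode, named_entity_re)."""
    unicode_to_name = {}
    name_to_unicode = {}
    for name_with_semicolon, character in sorted(html5.items()):
        # html5 lists some names both with and without the trailing ';'.
        if name_with_semicolon.endswith(";"):
            name = name_with_semicolon[:-1]
        else:
            name = name_with_semicolon
        name_to_unicode.setdefault(name, character)
        # Assumption: keep the first name seen for a given string.
        unicode_to_name.setdefault(character, name)

    # Skip all-ASCII strings, and try longer strings first so that
    # multi-character entities win over their one-character prefixes.
    candidates = [s for s in unicode_to_name if not s.isascii()]
    candidates.sort(key=len, reverse=True)
    named_entity_re = re.compile("|".join(re.escape(s) for s in candidates))
    return unicode_to_name, name_to_unicode, named_entity_re


unicode_to_name, name_to_unicode, named_entity_re = build_entity_tables()
```

Sorting the alternatives longest-first matters because Python's `re` alternation takes the first branch that matches, not the longest one.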
<20>;N<><4E><EFBFBD><EFBFBD><EFBFBD><EFBFBD><00><>z<>&css<00>|] }t|<01>dkVqdS)rN)<01>ord<72><02>.0<EFBFBD>xr r r <00> <genexpr><3E>s<02>z?EntitySubstitution._populate_class_variables.<locals>.<genexpr>r<00>cSsg|]}|d<00>qS)rr rr r r <00>
<listcomp><3E><00>z@EntitySubstitution._populate_class_variables.<locals>.<listcomp>z
%s(?![%s])z(%s)<29>|)<11>setr<00>sortedr<00>items<6D>endswith<74>lenr<00>all<6C>add<64>join<69>list<73>valuesr<00>chr<68>re<72>compile)<0F>unicode_to_name<6D>name_to_unicode<64>short_entities<65> long_entities_by_first_character<65>name_with_semicolon<6F> character<65>name<6D> particles<65>short<72> long_versions<6E>ignore<72> long_entities<65> long_entity<74> re_definition<6F> codepointr r r <00>_populate_class_variablesFsH
<02>    <02>
z,EntitySubstitution._populate_class_variables<65>apos<6F>quot<6F>amp<6D>lt<6C>gt)<05>'<27>"<22>&<26><<3C>>z&([<>]|&(?!#\d+;|#x[0-9a-fA-F]+;|\w+;))z([<>&])cCs|j<00>|<01>d<01><01>}d|S)zpUsed with a regular expression to substitute the
appropriate HTML entity for a special character string.r<00>&%s;)<03>CHARACTER_TO_HTML_ENTITY<54>get<65>group<75><03>cls<6C>matchobj<62>entityr r r <00>_substitute_html_entity<74>sz*EntitySubstitution._substitute_html_entitycCs|j|<01>d<01>}d|S)zoUsed with a regular expression to substitute the
appropriate XML entity for a special character string.rrI)<02>CHARACTER_TO_XML_ENTITYrLrMr r r <00>_substitute_xml_entity<74>sz)EntitySubstitution._substitute_xml_entitycCs6d}d|vrd|vrd}|<01>d|<03>}nd}|||S)a*Make a value into a quoted XML attribute, possibly escaping it.
Most strings will be quoted using double quotes.
Bob's Bar -> "Bob's Bar"
If a string contains double quotes, it will be quoted using
single quotes.
Welcome to "my bar" -> 'Welcome to "my bar"'
If a string contains both single and double quotes, the
double quotes will be escaped, and the string will be quoted
using double quotes.
Welcome to "Bob's Bar" -> "Welcome to &quot;Bob's Bar&quot;"
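The three quoting rules in this docstring fit in a few lines. The following is a standalone sketch of those rules as a plain function, written from the docstring rather than recovered from the bytecode.

```python
def quoted_attribute_value(value: str) -> str:
    """Quote an attribute value following the three rules above."""
    quote_with = '"'
    if '"' in value:
        if "'" in value:
            # Both kinds of quote present: escape the double quotes
            # and keep double quotes as the delimiter.
            value = value.replace('"', "&quot;")
        else:
            # Only double quotes present: delimit with single quotes.
            quote_with = "'"
    return quote_with + value + quote_with
```

Single quotes are never escaped, since the delimiter is always a double quote whenever the value contains one.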
rErDz&quot;)<01>replace)<04>self<6C>value<75>
quote_with<EFBFBD> replace_withr r r <00>quoted_attribute_value<75>s z)EntitySubstitution.quoted_attribute_valueFcC<00>"|j<00>|j|<01>}|r|<00>|<01>}|S)a Substitute XML entities for special XML characters.
:param value: A string to be substituted. The less-than sign
will become &lt;, the greater-than sign will become &gt;,
and any ampersands will become &amp;. If you want ampersands
that appear to be part of an entity definition to be left
alone, use substitute_xml_containing_entities() instead.
:param make_quoted_attribute: If True, then the string will be
quoted, as befits an attribute value.
)<04>AMPERSAND_OR_BRACKET<45>subrSrY<00>rNrV<00>make_quoted_attributer r r <00>substitute_xmls <04>
z!EntitySubstitution.substitute_xmlcCrZ)a<>Substitute XML entities for special XML characters.
:param value: A string to be substituted. The less-than sign will
become &lt;, the greater-than sign will become &gt;, and any
ampersands that are not part of an entity definition will
become &amp;.
:param make_quoted_attribute: If True, then the string will be
quoted, as befits an attribute value.
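The difference between the two substitution methods comes down to one negative lookahead. The sketch below uses the two patterns that appear in the class constants; the module-level function names and the `_to_entity` helper are illustrative, and the `make_quoted_attribute` step is omitted.

```python
import re

XML_ENTITIES = {"<": "lt", ">": "gt", "&": "amp"}

# Every '<', '>' and '&' is replaced.
AMPERSAND_OR_BRACKET = re.compile(r"([<>&])")

# '&' is replaced only when it does not start something that already
# looks like an entity (&#233; &#xE9; or &name;).
BARE_AMPERSAND_OR_BRACKET = re.compile(
    r"([<>]|&(?!#\d+;|#x[0-9a-fA-F]+;|\w+;))"
)


def _to_entity(match):
    return "&%s;" % XML_ENTITIES[match.group(0)]


def substitute_xml(value):
    return AMPERSAND_OR_BRACKET.sub(_to_entity, value)


def substitute_xml_containing_entities(value):
    return BARE_AMPERSAND_OR_BRACKET.sub(_to_entity, value)
```

With the first function an existing `&amp;` becomes `&amp;amp;`; the second leaves it untouched.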
)<04>BARE_AMPERSAND_OR_BRACKETr\rSrYr]r r r <00>"substitute_xml_containing_entitiess <04>
z5EntitySubstitution.substitute_xml_containing_entitiescCs|j<00>|j|<01>S)aReplace certain Unicode characters with named HTML entities.
This differs from data.encode(encoding, 'xmlcharrefreplace')
in that the goal is to make the result more readable (to those
with ASCII displays) rather than to recover from
errors. There's absolutely nothing wrong with a UTF-8 string
containing a LATIN SMALL LETTER E WITH ACUTE, but replacing that
character with "&eacute;" will make it more readable to some
people.
:param s: A Unicode string.
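To make the readability goal concrete, here is a reduced sketch built on `html.entities.codepoint2name`, which the module imports. It is an assumption-laden stand-in: it covers only single characters with an HTML 4 name, whereas the class's own regular expression is built from the larger HTML5 table. The function name is hypothetical.

```python
import re
from html.entities import codepoint2name

NON_ASCII = re.compile(r"[^\x00-\x7f]")


def substitute_html_sketch(s: str) -> str:
    """Replace non-ASCII characters that have a named HTML entity."""
    def replace(match):
        name = codepoint2name.get(ord(match.group(0)))
        # Characters with no named entity are left exactly as they are.
        return "&%s;" % name if name else match.group(0)

    return NON_ASCII.sub(replace, s)
```

Nothing here is lossy: characters without a name stay in the string, so the result is still valid Unicode text.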
)<03>CHARACTER_TO_HTML_ENTITY_REr\rQ)rNr r r r <00>substitute_html+s<04>z"EntitySubstitution.substitute_htmlN)F)<14>__name__<5F>
__module__<EFBFBD> __qualname__<5F>__doc__r>rJ<00>HTML_ENTITY_TO_CHARACTERrbrRr-r.r`r[<00> classmethodrQrSrYr_rarcr r r r rCs6w<06><06>




$  <0C>rc@sNeZdZdZ   ddd<05>Zdd<07>Zedd <09><00>Zed
d <0B><00>Z edd d <0A><01>Z
dS)<10>EncodingDetectoraLSuggests a number of possible encodings for a bytestring.
Order of precedence:
1. Encodings you specifically tell EncodingDetector to try first
(the known_definite_encodings argument to the constructor).
2. An encoding determined by sniffing the document's byte-order mark.
3. Encodings you specifically tell EncodingDetector to try if
byte-order mark sniffing fails (the user_encodings argument to the
constructor).
4. An encoding declared within the bytestring itself, either in an
XML declaration (if the bytestring is to be interpreted as an XML
document), or in a <meta> tag (if the bytestring is to be
interpreted as an HTML document.)
5. An encoding detected through textual analysis by chardet,
cchardet, or a similar external library.
6. UTF-8.
7. Windows-1252.
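The order of precedence can be pictured as a single generator that filters out excluded and already-tried names. This is a sketch under stated assumptions: `candidate_encodings` and its parameters are hypothetical, and the BOM, declared, and detected encodings are passed in as plain values instead of being computed from the markup.

```python
def candidate_encodings(known_definite=(), bom_encoding=None, user=(),
                        declared=None, detected=None, exclude=()):
    """Yield encodings to try, in the order of precedence listed above."""
    excluded = {name.lower() for name in exclude}
    tried = set()
    ordered = [
        *known_definite,   # 1. caller's definite encodings
        bom_encoding,      # 2. byte-order mark
        *user,             # 3. caller's fallback encodings
        declared,          # 4. XML declaration or <meta> tag
        detected,          # 5. chardet-style detection
        "utf-8",           # 6.
        "windows-1252",    # 7.
    ]
    for encoding in ordered:
        if encoding is None:
            continue
        encoding = encoding.lower()
        if encoding in excluded or encoding in tried:
            continue
        tried.add(encoding)
        yield encoding
```

For example, `list(candidate_encodings(declared="ISO-8859-1"))` yields the declared encoding first and the two built-in fallbacks after it.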
NFcCsnt|pg<00>|_|r|j|7_|pg|_|pg}tdd<02>|D<00><01>|_d|_||_d|_|<00>|<01>\|_ |_
dS)a<>Constructor.
:param markup: Some markup in an unknown encoding.
:param known_definite_encodings: When determining the encoding
of `markup`, these encodings will be tried first, in
order. In HTML terms, this corresponds to the "known
definite encoding" step defined here:
https://html.spec.whatwg.org/multipage/parsing.html#parsing-with-a-known-character-encoding
:param user_encodings: These encodings will be tried after the
`known_definite_encodings` have been tried and failed, and
after an attempt to sniff the encoding by looking at a
byte order mark has failed. In HTML terms, this
corresponds to the step "user has explicitly instructed
the user agent to override the document's character
encoding", defined here:
https://html.spec.whatwg.org/multipage/parsing.html#determining-the-character-encoding
:param override_encodings: A deprecated alias for
known_definite_encodings. Any encodings here will be tried
immediately after the encodings in
known_definite_encodings.
:param is_html: If True, this markup is considered to be
HTML. Otherwise it's assumed to be XML.
:param exclude_encodings: These encodings will not be tried,
even if they otherwise would be.
cSsg|]}|<01><00><00>qSr )<01>lowerrr r r rr z-EncodingDetector.__init__.<locals>.<listcomp>N) r*<00>known_definite_encodings<67>user_encodingsr"<00>exclude_encodings<67>chardet_encoding<6E>is_html<6D>declared_encoding<6E>strip_byte_order_mark<72>markup<75>sniffed_encoding)rUrsrlrprnrm<00>override_encodingsr r r <00>__init__Xs"
zEncodingDetector.__init__cCs8|dur|<01><00>}||jvrdS||vr|<02>|<01>dSdS)z<>Should we even bother to try this encoding?
:param encoding: Name of an encoding.
:param tried: Encodings that have already been tried. This will be modified
as a side effect.
NFT)rkrnr()rUr<00>triedr r r <00>_usable<6C>s

zEncodingDetector._usableccs<><00>t<00>}|jD] }|<00>||<01>r|Vq|<00>|j|<01>r|jV|jD] }|<00>||<01>r,|Vq!|jdur;|<00>|j|j<08>|_|<00>|j|<01>rF|jV|j durQt
|j<07>|_ |<00>|j |<01>r\|j VdD] }|<00>||<01>ri|Vq^dS)zmYield a number of encodings that might work for this markup.
:yield: A sequence of strings.
N)<02>utf-8<> windows-1252) r"rlrxrtrmrq<00>find_declared_encodingrsrpror)rUrw<00>er r r <00> encodings<67>s6<02>
 <02>
 <02>
<06>
  <02><04>zEncodingDetector.encodingscCsd}t|t<01>r ||fSt|<01>dkr-|dd<03>dkr-|dd<02>dkr-d}|dd<01>}||fSt|<01>dkrO|dd<03>dkrO|dd<02>dkrOd}|dd<01>}||fS|dd <09>d
krcd }|d d<01>}||fS|dd<02>d krwd }|dd<01>}||fS|dd<02>dkr<>d}|dd<01>}||fS)z<>If a byte-order mark is present, strip it and return the encoding it implies.
:param data: Some markup.
:return: A 2-tuple (modified data, implied encoding)
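A byte-order-mark check needs only the constants in `codecs`. The sketch below returns the same kind of 2-tuple, but its control flow is an assumption: it tests the 4-byte UTF-32 marks before the 2-byte UTF-16 marks instead of inspecting the third and fourth bytes, so UTF-16LE text whose first character is NUL would be misread as UTF-32LE.

```python
import codecs

# UTF-32 first: the UTF-32LE mark begins with the UTF-16LE mark.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32be"),
    (codecs.BOM_UTF32_LE, "utf-32le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16be"),
    (codecs.BOM_UTF16_LE, "utf-16le"),
]


def strip_byte_order_mark(data):
    """Return (data without its BOM, encoding implied by the BOM)."""
    if isinstance(data, str):
        # Already decoded text has no byte-order mark to sniff.
        return data, None
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return data[len(bom):], encoding
    return data, None
```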
N<><00>s<00><>zzutf-16bes<00><>zutf-16le<6C>srys<00><>zutf-32bes<00><>zutf-32le)rrr&)rN<00>datarr r r rr<00>s6
<02> <1C><02> 
<10> <10> <10> z&EncodingDetector.strip_byte_order_markc Cs<>|r t|<01>}}n d}tdtt|<01>d<00><01>}t|t<04>r tt}ntt}|d}|d}d} |j||d<07>}
|
s@|r@|j||d<07>}
|
durJ|
<EFBFBD><08>d} | r[t| t<04>rW| <09> d d
<EFBFBD>} | <09>
<EFBFBD>SdS) a<>Given a document, tries to find its declared encoding.
An XML encoding is declared at the beginning of the document.
An HTML encoding is declared in a <meta> tag, hopefully near the
beginning of the document.
:param markup: Some markup.
:param is_html: If True, this markup is considered to be HTML. Otherwise
it's assumed to be XML.
:param search_entire_document: Since an encoding is supposed to be declared near the beginning
of the document, most of the time it's only necessary to search a few kilobytes of data.
Set this to True to force this method to search the entire document.
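The search can be sketched for bytestrings using the two patterns that appear near the top of this file. The prefix limits (1024 bytes for XML, the larger of 2048 bytes or 5% of the document for HTML) are assumptions, since the constants are not readable in the bytecode, and this version is a plain function, not a classmethod.

```python
import re

XML_DECLARATION = re.compile(
    rb"^\s*<\?.*encoding=['\"](.*?)['\"].*\?>", re.I
)
HTML_META = re.compile(
    rb"<\s*meta[^>]+charset\s*=\s*[\"']?([^>]*?)[ /;'\">]", re.I
)


def find_declared_encoding(markup: bytes, is_html=False,
                           search_entire_document=False):
    """Return the lowercased declared encoding, or None."""
    if search_entire_document:
        xml_endpos = html_endpos = len(markup)
    else:
        xml_endpos = 1024                                  # assumed limit
        html_endpos = max(2048, int(len(markup) * 0.05))   # assumed limit

    match = XML_DECLARATION.search(markup, endpos=xml_endpos)
    if match is None and is_html:
        match = HTML_META.search(markup, endpos=html_endpos)
    if match is None:
        return None
    return match.group(1).decode("ascii", "replace").lower()
```

Limiting `endpos` keeps the lazy `.*?` groups from scanning a multi-megabyte document when the declaration is absent.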
iig<><67><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>?rrN)<01>endposrrrT) r&<00>max<61>intr<00>bytes<65> encoding_resr<00>search<63>groups<70>decoderk) rNrsrp<00>search_entire_document<6E>
xml_endpos<EFBFBD> html_endpos<6F>res<65>xml_re<72>html_rerq<00>declared_encoding_matchr r r r{<00>s(

 
 z'EncodingDetector.find_declared_encoding)NFNNN)FF) rdrerfrgrvrx<00>propertyr}rirrr{r r r r rj=s
<EFBFBD>/
+
rjc@s<>eZdZdZddd<04>Zgd<05>Zgddgddfdd <09>Zd
d <0B>Z<07>d<>d d<0E>Z<08>d<>dd<10>Z e
dd<12><00>Z dd<14>Z dd<16>Z idd<18>dd<1A>dd<1C>dd<1E>dd <20>d!d"<22>d#d$<24>d%d&<26>d'd(<28>d)d*<2A>d+d,<2C>d-d.<2E>d/d0<64>d1d2<64>d3d4<64>d5d2<64>d6d2<64>d7d8d9d:d;d<d=d>d?d@dAdBd2dCdDdE<64><0F>ZiddF<64>dd<1A>ddG<64>ddH<64>ddI<64>d!dJ<64>d#dK<64>d%dL<64>d'dM<64>d)dN<64>d+dO<64>d-dP<64>d/dQ<64>d1d2<64>d3dR<64>d5d2<64>d6d2<64>idSdT<64>dUdT<64>dVdW<64>dXdW<64>dYdZ<64>d[d\<5C>d]d^<5E>d_d`<60>dadb<64>dcdd<64>dedf<64>dgdh<64>did2<64>djdk<64>dldm<64>dnd<1A>dodp<64><01>idqdr<64>dsdt<64>dudv<64>dwdx<64>dydz<64>d{dO<64>d|d}<7D>d~d<64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>dp<64>d<>d<1A>d<>d<EFBFBD><64>d<>d\<5C>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64><01>id<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>dZ<64>d<>dG<64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d2<64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64><01>id<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64><01>id<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>dZ<64>d<>d<EFBFBD><64>d<>d<>d<>d<>d<>dm<64>d<>d<>d<>d<>d<>d<>d<>d<><01>id<>d<>dr<64>d<>d<>d<>d<>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64><01>d<>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD><64> <09>Zid<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<>d<EFBFBD><64>d<><64>d<00><01>d<01>d<02><01>d<03>d<04><01>d<05>d<06><01>d<07>d<08><01>d <09>d
<EFBFBD><01>d <0B>d <0C><01>d <0A>d<0E><01>d<0F>d<10>i<00>d<11>d<12><01>d<13>d<14><01>d<15>d<16><01>d<17>d<18><01>d<19>d<1A><01>d<1B>d<1C><01>d<1D>d<1E><01>d<1F>d <20><01>d!<21>d"<22><01>d#<23>d$<24><01>d%<25>d&<26><01>d'<27>d(<28><01>d)<29>d*<2A><01>d+<2B>d,<2C><01>d-<2D>d.<2E><01>d/<2F>d0<64><01>d1<64>d2<64><01>i<00>d3<64>d4<64><01>d5<64>d6<64><01>d7<64>d8<64><01>d9<64>d:<3A><01>d;<3B>d<<3C><01>d=<3D>d><3E><01>d?<3F>d@<40><01>dA<64>dB<64><01>dC<64>dD<64><01>dE<64>dF<64><01>dG<64>dH<64><01>dI<64>dJ<64><01>dK<64>dL<64><01>dM<64>dN<64><01>dO<64>dP<64><01>dQ<64>dR<64><01>dS<64>dT<64><01>i<00>dU<64>dV<64><01>dW<64>dX<64><01>dY<64>dZ<64><01>d[<5B>d\<5C><01>d]<5D>d^<5E><01>d_<64>d`<60><01>da<64>db<64><01>dc<64>dd<64><01>de<64>df<64><01>dg<64>dh<64><01>di<64>dj<64><01>dk<64>dl<64><01>dm<64>dn<64><01>do<64>dp<64><01>dq<64>dr<64><01>ds<64>dt<64><01>du<64>dv<64><01>i<00>dw<64>dx<64><01>dy<64>dz<64><01>d{<7B>d|<7C><01>d}<7D>d~<7E><01>d<64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>i<00>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<>do<64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>i<00>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<><64>d<><64><01>d<01><01><01><01><01><01><01>dА<01><01><01><01><01><01><01><03>Zg<00>d<><64>Ze<11>d<><00>d<>Ze<11>d<><00>d<>Ze<14> <09><> <09><>d<><64>d<><64>d<><64><01>ZdS(<28><00> UnicodeDammitz<74>A class for detecting the encoding of a *ML document and
converting it to a Unicode string. If the source encoding is
windows-1252, can replace MS smart quotes with their HTML or XML
equivalents.z mac-romanz shift-jis)<02> macintoshzx-sjis)rzz
iso-8859-1z
iso-8859-2NFc
Cs<>||_g|_d|_||_t<04>t<06>|_t||||||<07>|_ t
|t <0B>s%|dkr2||_ t |<01>|_ d|_dS|j j |_ d}|j jD]} |j j }|<00>| <09>}|durNnq=|sq|j jD]} | dkra|<00>| d<05>}|durp|j<07>d<06>d|_nqU||_ |s{d|_dSdS)a2Constructor.
:param markup: A bytestring representing markup in an unknown encoding.
:param known_definite_encodings: When determining the encoding
of `markup`, these encodings will be tried first, in
order. In HTML terms, this corresponds to the "known
definite encoding" step defined here:
https://html.spec.whatwg.org/multipage/parsing.html#parsing-with-a-known-character-encoding
:param user_encodings: These encodings will be tried after the
`known_definite_encodings` have been tried and failed, and
after an attempt to sniff the encoding by looking at a
byte order mark has failed. In HTML terms, this
corresponds to the step "user has explicitly instructed
the user agent to override the document's character
encoding", defined here:
https://html.spec.whatwg.org/multipage/parsing.html#determining-the-character-encoding
:param override_encodings: A deprecated alias for
known_definite_encodings. Any encodings here will be tried
immediately after the encodings in
known_definite_encodings.
:param smart_quotes_to: By default, Microsoft smart quotes will, like all other characters, be converted
to Unicode characters. Setting this to 'ascii' will convert them to ASCII quotes instead.
Setting it to 'xml' will convert them to XML entity references, and setting it to 'html'
will convert them to HTML entity references.
:param is_html: If True, this markup is considered to be HTML. Otherwise
it's assumed to be XML.
:param exclude_encodings: These encodings will not be considered, even
if the sniffing code thinks they might make sense.
FrNrrTzSSome characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.T)<12>smart_quotes_to<74>tried_encodings<67>contains_replacement_charactersrp<00>logging<6E> getLoggerrd<00>logrj<00>detectorrrrs<00>unicode_markup<75>original_encodingr}<00> _convert_from<6F>warning)
rUrsrlr<>rprnrmru<00>urr r r rvsJ& <06>

 
<02>  <04><02> 
<04>zUnicodeDammit.__init__cCs<>|<01>d<01>}|jdkr|j<02>|<02><01><04>}|S|j<05>|<02>}t|<03>tkrE|jdkr5d<04><04>|d<00><04>d<05><04>}|Sd<06><04>|d<00><04>d<05><04>}|S|<03><04>}|S)z[Changes a MS smart quote character to an XML or HTML
entity, or an ASCII character.rrrz&#xrrFr)rLr<><00>MS_CHARS_TO_ASCIIrK<00>encode<64>MS_CHARS<52>type<70>tuple)rU<00>match<63>origr\r r r <00> _sub_ms_charus


<0C> 
<1C><08>zUnicodeDammit._sub_ms_char<61>strictc
Cs<>|<00>|<01>}|r||f|jvrdS|j<01>||f<02>|j}|jdur3||jvr3d}t<06>|<04>}|<05>|j |<03>}z|<00>
|||<02>}||_||_ W|jSt yW}zWYd}~dSd}~ww)z|Attempt to convert the markup to the proposed encoding.
:param proposed: The name of a character encoding.
Ns([<5B>-<2D>])) <0A>
find_codecr<EFBFBD><00>appendrsr<><00>ENCODINGS_WITH_SMART_QUOTESr-r.r\r<><00> _to_unicoder<65><00> Exception)rU<00>proposed<65>errorsrs<00>smart_quotes_re<72>smart_quotes_compiledr<64>r|r r r r<><00>s(

<02>
<0E><08><02>zUnicodeDammit._convert_fromcCs t|||<03>S)z}Given a string and its encoding, decodes the string into Unicode.
:param encoding: The name of an encoding.
)r)rUr<>rr<>r r r r<><00>s zUnicodeDammit._to_unicodecCs|jsdS|jjS)zhIf the markup is an HTML document, returns the encoding declared _within_
the document.
N)rpr<>rq)rUr r r <00>declared_html_encoding<6E>sz$UnicodeDammit.declared_html_encodingcCs`|<00>|j<01>||<01><02>p'|o|<00>|<01>dd<02><02>p'|o|<00>|<01>dd<03><02>p'|o%|<01><04>p'|}|r.|<02><04>SdS)z<>Convert the name of a character set to a codec name.
:param charset: The name of a character set.
:return: The name of a codec.
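Resolving a charset name amounts to trying a few spellings against `codecs.lookup`. The sketch below follows the fallback chain visible in the bytecode (alias table, hyphens removed, hyphens turned into underscores, then the lowercased name) as plain functions; the two aliases are the ones listed on the class.

```python
import codecs

CHARSET_ALIASES = {"macintosh": "mac-roman", "x-sjis": "shift-jis"}


def _codec(charset):
    """Return charset if Python knows a codec by that name, else None."""
    if not charset:
        return charset
    try:
        codecs.lookup(charset)
    except (LookupError, ValueError):
        return None
    return charset


def find_codec(charset):
    value = (
        _codec(CHARSET_ALIASES.get(charset, charset))
        or (charset and _codec(charset.replace("-", "")))
        or (charset and _codec(charset.replace("-", "_")))
        or (charset and charset.lower())
        or charset
    )
    return value.lower() if value else None
```

An unknown name is still returned lowercased, so a failed lookup only shows up later, when decoding is attempted.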
<20>-r<00>_N)<05>_codec<65>CHARSET_ALIASESrKrTrk)rU<00>charsetrVr r r r<><00>s<02><02>
<02><02>zUnicodeDammit.find_codecc Cs:|s|Sd}z
t<00>|<01>|}W|SttfyY|Swr)<04>codecs<63>lookup<75> LookupError<6F>
ValueError)rUr<><00>codecr r r r<><00>s
<10><02>zUnicodeDammit._codec<65><00>)<02>euro<72>20AC<41><00><> <20><00>)<02>sbquo<75>201A<31><00>)<02>fnof<6F>192<39><00>)<02>bdquo<75>201E<31><00>)<02>hellip<69>2026<32><00>)<02>dagger<65>2020<32><00>)<02>Dagger<65>2021<32><00>)<02>circ<72>2C6<43><00>)<02>permil<69>2030<33><00>)<02>Scaron<6F>160<36><00>)<02>lsaquo<75>2039<33><00>)<02>OElig<69>152<35><00><>?<3F><00>)z#x17D<37>17D<37><00><><00>)<02>lsquo<75>2018)<02>rsquo<75>2019)<02>ldquo<75>201C)<02>rdquo<75>201D)<02>bull<6C>2022)<02>ndash<73>2013)<02>mdash<73>2014)<02>tilde<64>2DC)<02>trade<64>2122)<02>scaron<6F>161)<02>rsaquo<75>203A)<02>oelig<69>153)z#x17E<37>17E)<02>Yumlr)<0F><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><>EUR<55>,<2C>fz,,z...<2E>+z++<2B>^<5E>%<25>SrG<00>OE<4F>ZrrDrrrErr<00>*rr<>r z--r
<00>~r z(TM)r r r rHr<00>oerr<00>zr<00>Y<><00><><00><>!<21><00><>c<><00><>GBP<42><00><>$<24><00><>YEN<45><00>r!<00><00><><00>z..<2E><00>r<00><00>z(th)<29><00>z<<<3C><00><><00><><00>z(R)<29><00><><00><>o<><00>z+-<2D><00><>2<><00><>3<><00>)rD<00>acute<74><00>r<EFBFBD><00><00><>P<><00><><00><><00><>1<><00><><00>z>><3E><00>z1/4<><00>z1/2<><00>z3/4<><00><><00><>A<><00><><00><><00><><00><><00><><00><>AE<41><00><>C<><00><>E<><00><><00><><00><><00><>I<><00><><00><><00><><00><>D<><00><>N<><00><>O<><00><><00><><00><><00><><00><><00><><00><>U<><00><><00><><00><><00><><00><>b<><00><>B<><00><>a<><00><><00><><00><><00><><00><><00><>ae<61><00><><00>r|<00><00><><00><><00><><00><>i<><00><><00><><00><><00><><00><>n<><00><><00><><00><><00><><00><>/<2F>y) <09><00><><00><><00><><00><><00><><00><><00><><00><><00>rs<><E282AC>s<><E2809A>sƒ<><C692>s<><E2809E>s<><E280A6>s<><E280A0>s<><E280A1>sˆ<><CB86>s<><E280B0>sŠ<><C5A0>s<><E280B9>sŒ<><C592>sŽ<><C5BD>s<><E28098>s<><E28099>s<><E2809C>s<><E2809D>s<><E280A2>s<><E28093>s<><E28094>s˜<><CB9C>s<><E284A2>sš<><C5A1>s<><E280BA>sœ<><C593>sž<><C5BE>sŸ<><C5B8>s 
<><C2A0>s¡<><C2A1>s¢<><C2A2>s£<><C2A3>s¤<><C2A4>s¥<><C2A5>s¦<><C2A6>s§<><C2A7>s¨<><C2A8>s©<><C2A9>sª<><C2AA>s«<><C2AB>s¬<><C2AC>s­<><C2AD>s®<><C2AE>s¯<><C2AF>s°<><C2B0>s±<><C2B1>s²<><C2B2>s³<><C2B3>s´<><C2B4>sµ<><C2B5>s<><C2B6>s·<><C2B7>s¸<><C2B8>s¹<><C2B9>sº<><C2BA>s»<><C2BB>s¼<><C2BC>s½<><C2BD>s¾<><C2BE>s¿<><C2BF>sÀ<><C380>sÁ<><C381>sÂ<><C382>sÃ<><C383>sÄ<><C384>sÅ<><C385>sÆ<><C386>sÇ<><C387>sÈ<><C388>sÉ<><C389>sÊ<><C38A>sË<><C38B>sÌ<><C38C>sÍ<><C38D>sÎ<><C38E>sÏ<><C38F>sÐ<><C390>sÑ<><C391>sÒ<><C392>sÓ<><C393>sÔ<><C394>sÕ<><C395>sÖ<><C396>s×<><C397>sØ<><C398>sÙ<><C399>sÚ<><C39A>sÛ<><C39B>sÜ<><C39C>sÝ<><C39D>sÞ<><C39E>sß<><C39F>sà<><C3A0><00><>sâ<><C3A2>sã<><C3A3>sä<><C3A4>så<><C3A5>sæ<><C3A6>sç<><C3A7>sè<><C3A8>sé<><C3A9>sê<><C3AA>së<><C3AB>sì<><C3AC>sí<><C3AD>sî<><C3AE>sï<><C3AF>sð<><C3B0>sñ<><C3B1>sò<><C3B2>só<><C3B3>sô<><C3B4>sõ<><C3B5>sö<><C3B6>s÷<><C3B7>sø<><C3B8>sù<><C3B9>sú<><C3BA>sûsüsýsþ)<03><><00><><00><>))r<>r<>r)r<>rr<>)rr
r~rrr<00>utf8rzc Cs$|<03>dd<02><02><01>dvrtd<04><01>|<02><01>dvrtd<06><01>g}d}d}|t|<01>kr~||}t|t<05>s1t|<07>}||jkrS||jkrS|j D]\}} }
||krQ|| krQ||
7}nq>n%|dkrt||j
vrt|<04> |||<06><00>|<04> |j
|<00>|d 7}|}n|d 7}|t|<01>ks$|dkr<>|S|<04> ||d
<EFBFBD><00>d <0B> |<04>S) aFix characters from one encoding embedded in some other encoding.
Currently the only situation supported is Windows-1252 (or its
subset ISO-8859-1), embedded in UTF-8.
:param in_bytes: A bytestring that you suspect contains
characters from multiple encodings. Note that this _must_
be a bytestring. If you've already converted the document
to Unicode, you're too late.
:param main_encoding: The primary encoding of `in_bytes`.
:param embedded_encoding: The encoding that was used to embed characters
in the main document.
:return: A bytestring in which `embedded_encoding`
characters have been converted to their `main_encoding`
equivalents.
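The byte-level walk can be shown in miniature. This sketch derives the Windows-1252 table from Python's own codec instead of a hand-written dictionary, and the name `detwingle_sketch` is hypothetical. Like the approach described above, it trusts any byte in the UTF-8 lead-byte range and copies the whole sequence unchecked, so Windows-1252 letters in that range are not converted.

```python
# Windows-1252 byte -> UTF-8 bytes, for the bytes the codec defines.
CP1252_TO_UTF8 = {}
for _byte in range(0x80, 0x100):
    try:
        CP1252_TO_UTF8[_byte] = (
            bytes([_byte]).decode("windows-1252").encode("utf-8")
        )
    except UnicodeDecodeError:
        pass  # a few bytes (0x81, 0x8d, ...) are undefined

# (first lead byte, last lead byte, sequence length in bytes)
MULTIBYTE_MARKERS_AND_SIZES = [
    (0xC2, 0xDF, 2),
    (0xE0, 0xEF, 3),
    (0xF0, 0xF4, 4),
]


def detwingle_sketch(in_bytes: bytes) -> bytes:
    """Convert stray Windows-1252 bytes in UTF-8 data to UTF-8."""
    out = bytearray()
    pos = 0
    while pos < len(in_bytes):
        byte = in_bytes[pos]
        size = next(
            (n for lo, hi, n in MULTIBYTE_MARKERS_AND_SIZES if lo <= byte <= hi),
            0,
        )
        if size:
            # Looks like a UTF-8 sequence: copy it through untouched.
            out += in_bytes[pos:pos + size]
            pos += size
        elif byte in CP1252_TO_UTF8:
            out += CP1252_TO_UTF8[byte]
            pos += 1
        else:
            out.append(byte)
            pos += 1
    return bytes(out)
```

This must run on the raw bytes: once the document has been decoded with the wrong codec, the information needed to tell the two encodings apart is gone.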
r<>r<>)rz<00> windows_1252zPWindows-1252 and ISO-8859-1 are the only currently supported embedded encodings.)rryz4UTF-8 is the only currently supported main encoding.rrrN<>) rTrk<00>NotImplementedErrorr&rr<>r<00>FIRST_MULTIBYTE_MARKER<45>LAST_MULTIBYTE_MARKER<45>MULTIBYTE_MARKERS_AND_SIZES<45>WINDOWS_1252_TO_UTF8r<38>r)) rN<00>in_bytes<65> main_encoding<6E>embedded_encoding<6E> byte_chunks<6B> chunk_start<72>pos<6F>byte<74>start<72>end<6E>sizer r r <00> detwinglesD<04> <04> 

<02><04> <0C>
zUnicodeDammit.detwingle)r<>)rrz)rdrerfrgr<>r<>rvr<>r<>r<>r<>r<>r<>r<>r<>r<>rrrrrir'r r r r r<>
}<16>r<EFBFBD>)rg<00> __license__<5F> html.entitiesr<00> collectionsrr<>r-r<><00>stringr<00>cchardet<65> ImportError<6F>chardet<65>charset_normalizerr<00> xml_encoding<6E> html_meta<74>dictr<74>r.r<>r\r<>rr<00>objectrrjr<>r r r r <00><module>sT        <02><04><02><04><02>

<EFBFBD>  
<EFBFBD> {N