§ Ç/xn@_äãó>—ddlZddlZddlmZGd„de¦«ZdS)éNé)ÚProbingStatecóš—eZdZdZdd„Zd„Zed„¦«Zd„Zed„¦«Z d„Z ed „¦«Zed „¦«Z ed„¦«ZdS) Ú CharSetProbergffffffî?Ncó^—d|_||_tjt¦«|_dS©N)Ú_stateÚlang_filterÚloggingÚ getLoggerÚ__name__Úlogger)Úselfr s ú“/builddir/build/BUILDROOT/alt-python311-pip-21.3.1-4.el9.x86_64/opt/alt/python311/lib/python3.11/site-packages/pip/_vendor/chardet/charsetprober.pyÚ__init__zCharSetProber.__init__'s'€ØˆŒØ&ˆÔÝÔ'Ñ1Ô1ˆŒˆˆócó(—tj|_dSr)rÚ DETECTINGr ©rs rÚresetzCharSetProber.reset,s€Ý"Ô,ˆŒˆˆrcó—dSr©rs rÚcharset_namezCharSetProber.charset_name/s€àˆtrcó—dSrr)rÚbufs rÚfeedzCharSetProber.feed3s€Øˆrcó—|jSr)r rs rÚstatezCharSetProber.state6s €àŒ{Ðrcó—dS)Ngrrs rÚget_confidencezCharSetProber.get_confidence:s€Øˆsrcó2—tjdd|¦«}|S)Ns([-])+ó )ÚreÚsub)rs rÚfilter_high_byte_onlyz#CharSetProber.filter_high_byte_only=s€åŒfÐ&¨¨cÑ2Ô2ˆØˆ rcó—t¦«}tjd|¦«}|D]Z}| |dd…¦«|dd…}| ¦«s|dkrd}| |¦«Œ[|S)u9 We define three types of bytes: alphabet: english alphabets [a-zA-Z] international: international characters [Â€-Ã¿] marker: everything else [^a-zA-ZÂ€-Ã¿] The input buffer can be thought to contain a series of words delimited by markers. This function works to filter all words that contain at least one international character. All contiguous sequences of markers are replaced by a single space ascii character. This filter applies to all scripts which do not use English characters. s%[a-zA-Z]*[€-ÿ]+[a-zA-Z]*[^a-zA-Z€-ÿ]?Néÿÿÿÿó€r")Ú bytearrayr#ÚfindallÚextendÚisalpha)rÚfilteredÚwordsÚwordÚ last_chars rÚfilter_international_wordsz(CharSetProber.filter_international_wordsBs€õ‘;”;ˆõ ” ÐOØñ ô ˆðð 'ð 'ˆDØOŠO˜D " œIÑ&Ô&Ð&ð˜R˜S˜Sœ ˆIØ×$Ò$Ñ&Ô&ð !¨9°wÒ+>Ð+>Ø ØOŠO˜IÑ&Ô&Ð&Ð&àˆrcó”—t¦«}d}d}tt|¦«¦«D]y}|||dz…}|dkrd}n|dkrd}|dkrS| ¦«s?||kr4|s2| |||…¦«| d¦«|dz}Œz|s| ||d …¦«|S) aÈ Returns a copy of ``buf`` that retains only the sequences of English alphabet and high byte characters that are not between <> characters. Also retains English alphabet and high byte characters immediately before occurrences of >. This filter can be applied to all scripts which contain both English characters and extended ASCII characters, but is currently only used by ``Latin1Prober``. Frró>órCsiðð:€€€Ø € € € àÐÐÐÐÐðnðnðnðnðnFñnônðnðnðnr