Encoding error when saving a web page to a text file

Problem description

from selenium import webdriver
import requests

linkElems = 0
browser = webdriver.Edge \
(executable_path=r'C:\Users\Admin\Downloads\edgedriver_win64\msedgedriver.exe')
lists = [r'http://rockfordcityil.iqm2.com/Citizens/calendar.aspx']
             
def rockford(lists):
    browser.get(lists[0])
    browser.find_element_by_css_selector('#ContentPlaceholder1_pnlMeetings \
    > div:nth-child(5) > div.RowTop > div.RowLink > a').click()
    browser.find_element_by_css_selector('#ContentPlaceholder1_hlPublicAgendaFile').click()
    switch_windows(browser.current_url)
    write_file(browser.current_url, 'Rockford.txt')

def switch_windows(url):
    original_window = browser.current_window_handle
    for window_handle in browser.window_handles:
        if window_handle != original_window:
            browser.switch_to_window(window_handle)
            
            
def write_file(url,fileName):
   res = requests.get(url)
   playFile = open(fileName , 'wb')
   for chunk in res.iter_content(100000):
       playFile.write(chunk)
 

rockford(lists)

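The code runs without errors, but when I open the text file in the working directory, it is unreadable. Here is a snippet of what it looks like when opened.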

%PDF-1.5
%µµµµ
1 0 obj
<</Type/Catalog/Pages 2 0 R/Lang(en-US) /StructTreeRoot 36 0 R/MarkInfo<</Marked true>>>>
endobj
2 0 obj
<</Type/Pages/Count 8/Kids[ 3 0 R 21 0 R 23 0 R 25 0 R 27 0 R 29 0 R 31 0 R 33 0 R] >>
endobj
3 0 obj
<</Type/Page/Parent 2 0 R/Resources<</Font<</F1 5 0 R/F2 9 0 R/F3 11 0 R/F4 13 0 R/F5 15 0 R/F6 19 0 R>>/ExtGState<</GS7 7 0 R/GS8 8 0 R>>/XObject<</Image17 17 0 R>>/ProcSet[/PDF/Text/ImageB/ImageC/ImageI] >>/Annots[ 18 0 R] /MediaBox[ 0 0 612 792] /Contents 4 0 R/Group<</Type/Group/S/Transparency/CS/DeviceRGB>>/Tabs/S/StructParents 0>>
endobj
4 0 obj
<</Filter/FlateDecode/Length 3083>>
stream
[compressed binary stream data follows, displayed as unreadable characters]

Tags: python-3.x, encoding

Solution


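The URL you are downloading points to a PDF file, not a text file (the saved data begins with %PDF-1.5), so the contents are binary and a text editor cannot display them. Save the file with a .pdf extension instead. Change: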

write_file(browser.current_url, 'Rockford.txt')

to this:

write_file(browser.current_url, 'Rockford.pdf')
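
If the file type is not known in advance, one option is to look at the first bytes of the download instead of hard-coding the extension. The following is only a minimal sketch of that idea: pick_extension and save_response_bytes are hypothetical helper names (not part of requests or Selenium), and the ".bin" fallback is an assumption made for illustration.

```python
def pick_extension(first_bytes, content_type=""):
    """Guess a file extension from the leading bytes or a Content-Type value."""
    # PDF files start with the marker "%PDF-", as in the output shown in the question.
    if first_bytes.startswith(b"%PDF-"):
        return ".pdf"
    content_type = content_type.lower()
    if "pdf" in content_type:
        return ".pdf"
    if content_type.startswith("text/"):
        return ".txt"
    # Assumed fallback for anything unrecognized.
    return ".bin"


def save_response_bytes(chunks, base_name, content_type=""):
    """Write an iterable of byte chunks to base_name plus a guessed extension."""
    chunks = iter(chunks)
    first = next(chunks, b"")
    path = base_name + pick_extension(first, content_type)
    # The with-block closes the file even if writing fails.
    with open(path, "wb") as out_file:
        out_file.write(first)
        for chunk in chunks:
            out_file.write(chunk)
    return path
```

With the question's code, this could be called as save_response_bytes(res.iter_content(100000), 'Rockford', res.headers.get('Content-Type', '')), which would be expected to produce Rockford.pdf for this URL.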
