Scraping a table from an asp.net page

Problem description

The page source from the asp.net page looks like this:

<table class="tbl dentry1" style="margin-top:2vh;">   
                  <tr>
                    <td>Conv - A&nbsp;</td>
                    <td><input name="ctl00$ContentPlaceHolder1$tha" type="text" value="6" id="ctl00_ContentPlaceHolder1_tha" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$la" type="text" value="49" id="ctl00_ContentPlaceHolder1_la" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$ta" type="text" value="49" id="ctl00_ContentPlaceHolder1_ta" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$tta" type="text" value="7.30" id="ctl00_ContentPlaceHolder1_tta" /></td>
                    <td>Mixer-1</td>
                    <td><input name="ctl00$ContentPlaceHolder1$mb1" type="text" id="ctl00_ContentPlaceHolder1_mb1" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$ml1" type="text" id="ctl00_ContentPlaceHolder1_ml1" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$mr1" type="text" id="ctl00_ContentPlaceHolder1_mr1" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$mc1" type="text" id="ctl00_ContentPlaceHolder1_mc1" /></td>
                </tr>

                <tr>
                <td>Conv - B&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$thb" type="text" value="6" id="ctl00_ContentPlaceHolder1_thb" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$lb" type="text" value="793" id="ctl00_ContentPlaceHolder1_lb" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$tb" type="text" value="58" id="ctl00_ContentPlaceHolder1_tb" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$ttb" type="text" value="6.30" id="ctl00_ContentPlaceHolder1_ttb" /></td>
                <td>Mixer-2</td>
                <td><input name="ctl00$ContentPlaceHolder1$mb2" type="text" value="114" id="ctl00_ContentPlaceHolder1_mb2" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$ml2" type="text" value="2" id="ctl00_ContentPlaceHolder1_ml2" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$mr2" type="text" value="136" id="ctl00_ContentPlaceHolder1_mr2" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$mc2" type="text" value="136" id="ctl00_ContentPlaceHolder1_mc2" /></td>
            </tr>

            <tr>
                <td>Conv - C&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$thc" type="text" value="4" id="ctl00_ContentPlaceHolder1_thc" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$lc" type="text" value="1583" id="ctl00_ContentPlaceHolder1_lc" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$tc" type="text" value="9" id="ctl00_ContentPlaceHolder1_tc" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$ttc" type="text" value="11.00" id="ctl00_ContentPlaceHolder1_ttc" /></td>
                <td colspan="5">&nbsp;&nbsp;</td>
            </tr>

            <tr style="text-align:center;">
                <td></td>
                <td>Poured</td>
                <td>Waiting</td>
                <td></td>
                <td></td>
                <td></td>
                <td>Poured</td>
                <td>Waiting</td>
                <td>Tonnnage</td>
                <td></td>
            </tr>

            <tr>
                <td>Loads</td>
                <td><input name="ctl00$ContentPlaceHolder1$lp" type="text" value="2" id="ctl00_ContentPlaceHolder1_lp" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$lw" type="text" value="0" id="ctl00_ContentPlaceHolder1_lw" /></td>
                <td></td>
                <td></td>
                <td>Torpedos&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$tp" type="text" value="8" id="ctl00_ContentPlaceHolder1_tp" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$tw" type="text" value="1" id="ctl00_ContentPlaceHolder1_tw" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$tt" type="text" value="2407" id="ctl00_ContentPlaceHolder1_tt" /></td>
                <td></td>
            </tr>

            <tr>
                <td>Slag Yard Trips</td>
                <td><input name="ctl00$ContentPlaceHolder1$syt" type="text" value="21" id="ctl00_ContentPlaceHolder1_syt" /></td>
                <td></td>
                <td></td>
                <td></td>
                <td colspan="3">
                    Lance Jam Cut
                    <input name="ctl00$ContentPlaceHolder1$ljc" type="text" value="2" id="ctl00_ContentPlaceHolder1_ljc" />
                </td>
                <td></td>
                <td></td>
            </tr>

            <tr style="text-align:center;">
                <td></td>
                <td>Received</td>
                <td>Consumed</td>
                <td></td>
                <td></td>
                <td></td>
                <td>Used</td>
                <td>Successful</td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Scrap</td>
                <td><input name="ctl00$ContentPlaceHolder1$sr" type="text" value="323" id="ctl00_ContentPlaceHolder1_sr" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$sc" type="text" value="192" id="ctl00_ContentPlaceHolder1_sc" /></td>
                <td></td>
                <td></td>
                <td>Dart&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$du" type="text" value="8" id="ctl00_ContentPlaceHolder1_du" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$ds" type="text" value="7" id="ctl00_ContentPlaceHolder1_ds" /></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <th colspan="10">Bin Position</th>
            </tr>

            <tr style="text-align:center;">
                <td></td>
                <td>Bin 1</td>
                <td>Bin 2</td>
                <td>Bin 3</td>
                <td>Bin 4</td>
                <td>Bin 5</td>
                <td>Bin 6</td>
                <td>Bin 7</td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Conv - A&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$b1a" type="text" value="15" id="ctl00_ContentPlaceHolder1_b1a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b2a" type="text" value="10" id="ctl00_ContentPlaceHolder1_b2a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b3a" type="text" value="45" id="ctl00_ContentPlaceHolder1_b3a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b4a" type="text" value="10" id="ctl00_ContentPlaceHolder1_b4a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b5a" type="text" value="50" id="ctl00_ContentPlaceHolder1_b5a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b6a" type="text" value="25" id="ctl00_ContentPlaceHolder1_b6a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b7a" type="text" value="0" id="ctl00_ContentPlaceHolder1_b7a" /></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Conv - B&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$b1b" type="text" value="10" id="ctl00_ContentPlaceHolder1_b1b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b2b" type="text" value="5" id="ctl00_ContentPlaceHolder1_b2b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b3b" type="text" value="40" id="ctl00_ContentPlaceHolder1_b3b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b4b" type="text" value="10" id="ctl00_ContentPlaceHolder1_b4b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b5b" type="text" value="60" id="ctl00_ContentPlaceHolder1_b5b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b6b" type="text" value="15" id="ctl00_ContentPlaceHolder1_b6b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b7b" type="text" value="0" id="ctl00_ContentPlaceHolder1_b7b" /></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Conv - C&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$b1c" type="text" id="ctl00_ContentPlaceHolder1_b1c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b2c" type="text" id="ctl00_ContentPlaceHolder1_b2c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b3c" type="text" value="50" id="ctl00_ContentPlaceHolder1_b3c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b4c" type="text" id="ctl00_ContentPlaceHolder1_b4c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b5c" type="text" value="45" id="ctl00_ContentPlaceHolder1_b5c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b6c" type="text" id="ctl00_ContentPlaceHolder1_b6c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b7c" type="text" value="35" id="ctl00_ContentPlaceHolder1_b7c" /></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Lime</td>
                <td><input name="ctl00$ContentPlaceHolder1$lime" type="text" value="155" id="ctl00_ContentPlaceHolder1_lime" /></td>
                <td style="text-align:center;">Dolo</td>
                <td><input name="ctl00$ContentPlaceHolder1$dolo" type="text" value="55" id="ctl00_ContentPlaceHolder1_dolo" /></td>
                <td style="text-align:center;">Ore</td>
                <td><input name="ctl00$ContentPlaceHolder1$ore" type="text" value="40" id="ctl00_ContentPlaceHolder1_ore" /></td>
                <td style="text-align:center;">Coke</td>
                <td><input name="ctl00$ContentPlaceHolder1$coke" type="text" id="ctl00_ContentPlaceHolder1_coke" /></td>
                <td style="text-align:center;">Raw Dolo</td>
                <td><input name="ctl00$ContentPlaceHolder1$rdolo" type="text" id="ctl00_ContentPlaceHolder1_rdolo" /></td>
            </tr>

            <tr>
                <td>Heats Recovered</td>
                <td><input name="ctl00$ContentPlaceHolder1$hr" type="text" value="16" id="ctl00_ContentPlaceHolder1_hr" /></td>
                <td style="text-align:center;" colspan="2">Amount Recovered</td>
                <td><input name="ctl00$ContentPlaceHolder1$ar" type="text" value="235000" id="ctl00_ContentPlaceHolder1_ar" /></td>
                <td style="text-align:center;" colspan="2">Amount Export</td>
                <td><input name="ctl00$ContentPlaceHolder1$ae" type="text" value="235000" id="ctl00_ContentPlaceHolder1_ae" /></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Holder Level</td>
                <td><input name="ctl00$ContentPlaceHolder1$hl" type="text" id="ctl00_ContentPlaceHolder1_hl" /></td>
                <td style="text-align:center;" colspan="2">Conv-C Recovery</td>
                <td><input name="ctl00$ContentPlaceHolder1$ccr" type="text" id="ctl00_ContentPlaceHolder1_ccr" /></td>
                <td></td>
                <td></td>
                <td></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Jam Pot Received</td>
                <td><input name="ctl00$ContentPlaceHolder1$jpr" type="text" value="2" id="ctl00_ContentPlaceHolder1_jpr" /></td>
                <td style="text-align:center;" colspan="2">Clear Pot Sent</td>
                <td><input name="ctl00$ContentPlaceHolder1$cps" type="text" value="2" id="ctl00_ContentPlaceHolder1_cps" /></td>
                <td style="text-align:center;" colspan="2">Ram Tree</td>
                <td><input name="ctl00$ContentPlaceHolder1$rt" type="text" id="ctl00_ContentPlaceHolder1_rt" /></td>
                <td></td>
                <td></td>
            </tr>
            
        </table>

        <table class="tbl dentry1" style="margin-top:2vh;">
            <caption>Delays & Injuries</caption>

            <tr>
                <th colspan="3" style="background-color:#09236b; color:#fff;">Delays</th>
            </tr>

            <tr>
                <th>Delay Time</th>
                <th>Agency</th>
                <th>Details</th>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt1" type="text" id="ctl00_ContentPlaceHolder1_dt1" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da1" type="text" id="ctl00_ContentPlaceHolder1_da1" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd1" type="text" value="LMB" id="ctl00_ContentPlaceHolder1_dd1" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt2" type="text" id="ctl00_ContentPlaceHolder1_dt2" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da2" type="text" id="ctl00_ContentPlaceHolder1_da2" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd2" type="text" id="ctl00_ContentPlaceHolder1_dd2" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt3" type="text" id="ctl00_ContentPlaceHolder1_dt3" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da3" type="text" id="ctl00_ContentPlaceHolder1_da3" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd3" type="text" value="Lime- 120 T       LHF=8.5 T       Dolo=62      I/O=  33T       LD Slag R=20         C=43 T" id="ctl00_ContentPlaceHolder1_dd3" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt4" type="text" id="ctl00_ContentPlaceHolder1_dt4" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da4" type="text" id="ctl00_ContentPlaceHolder1_da4" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd4" type="text" value="Si= 0.61       Basicity=  2.55        MgO= 12.64       SiO2=17.31" id="ctl00_ContentPlaceHolder1_dd4" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt5" type="text" id="ctl00_ContentPlaceHolder1_dt5" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da5" type="text" id="ctl00_ContentPlaceHolder1_da5" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd5" type="text" value="DS=0" id="ctl00_ContentPlaceHolder1_dd5" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt6" type="text" id="ctl00_ContentPlaceHolder1_dt6" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da6" type="text" id="ctl00_ContentPlaceHolder1_da6" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd6" type="text" value="A=  1335-3.90/1.33       B=  1373-1.83/2.33         C=1320-1.37/0.99" id="ctl00_ContentPlaceHolder1_dd6" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt7" type="text" id="ctl00_ContentPlaceHolder1_dt7" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da7" type="text" id="ctl00_ContentPlaceHolder1_da7" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd7" type="text" id="ctl00_ContentPlaceHolder1_dd7" style="width:25vw;" /></td>
            </tr>

            <tr>
                <th colspan="3">&nbsp;</th>
            </tr>

            <tr>
                <th colspan="3" style="background-color:#09236b; color:#fff;">Injuries</th>
            </tr>

            <tr>
                <th colspan="2">Name</th>
                <th>Details</th>
            </tr>

            <tr>
                <td colspan="2"><input name="ctl00$ContentPlaceHolder1$in1" type="text" id="ctl00_ContentPlaceHolder1_in1" style="width:10vw;" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$id1" type="text" id="ctl00_ContentPlaceHolder1_id1" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td colspan="2"><input name="ctl00$ContentPlaceHolder1$in2" type="text" id="ctl00_ContentPlaceHolder1_in2" style="width:10vw;" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$id2" type="text" id="ctl00_ContentPlaceHolder1_id2" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td colspan="2"><input name="ctl00$ContentPlaceHolder1$in3" type="text" id="ctl00_ContentPlaceHolder1_in3" style="width:10vw;" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$id3" type="text" id="ctl00_ContentPlaceHolder1_id3" style="width:25vw;" /></td>
            </tr>

            <tr>
                <th colspan="3" style="height:10.5vh;"><input type="hidden" name="ctl00$ContentPlaceHolder1$db_operation" id="ctl00_ContentPlaceHolder1_db_operation" value="update" />
                </th>
            </tr>

        </table>   

I need to extract the value from each cell. Some of them may be blank.

The code I tried is as follows:

table = soup.find('table',{'class':'tbl dentry1'})
data = []
table_rows = table.find_all('tr')

l = []
for tr in table_rows:
    td = tr.find_all('td')
    row = [tr.text for tr in td]
    l.append(row)

print(l)

But it only prints:

[[], [], ['Conv - A\xa0', '', '', '', '', 'Mixer-1', '', '', '', ''], ['Conv - B\xa0', '', '', '', '', 'Mixer-2', '', '', '', ''], ['Conv - C\xa0', '', '', '', '', '\xa0\xa0'], ['', 'Poured', 'Waiting', '', '', '', 'Poured', 'Waiting', 'Tonnnage', ''], ['Loads', '', '', '', '', 'Torpedos\xa0', '', '', '', ''], ['Slag Yard Trips', '', '', '', '', '\r\n Lance Jam Cut\r\n \n', '', ''], ['', 'Received', 'Consumed', '', '', '', 'Used', 'Successful', '', ''], ['Scrap', '', '', '', '', 'Dart\xa0', '', '', '', ''], [], ['', 'Bin 1', 'Bin 2', 'Bin 3', 'Bin 4', 'Bin 5', 'Bin 6', 'Bin 7', '', ''], ['Conv - A\xa0', '', '', '', '', '', '', '', '', ''], ['Conv - B\xa0', '', '', '', '', '', '', '', '', ''], ['Conv - C\xa0', '', '', '', '', '', '', '', '', ''], ['Lime', '', 'Dolo', '', 'Ore', '', 'Coke', '', 'Raw Dolo', ''], ['Heats Recovered', '', 'Amount Recovered', '', 'Amount Export', '', '', ''], ['Holder Level', '', 'Conv-C Recovery', '', '', '', '', '', ''], ['Jam Pot Received', '', 'Clear Pot Sent', '', 'Ram Tree', '', '', '']]

Where am I going wrong?

Tags: python, web-scraping, beautifulsoup

Solution
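
The empty strings in the output come from where the data lives. The numbers are not text inside each td; they sit in the value attribute of the input element, and .text only returns text nodes. Note also that soup.find() returns only the first matching table. If you want to stay with BeautifulSoup, a minimal sketch might look like the following. The sample_html string is a shortened, hypothetical stand-in for the real page, and cell_value and table_rows are helper names made up for this illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened sample in the same shape as the page source above.
sample_html = """
<table class="tbl dentry1">
  <tr>
    <td>Conv - A&nbsp;</td>
    <td><input name="tha" type="text" value="6" id="tha" /></td>
    <td><input name="mb1" type="text" id="mb1" /></td>
  </tr>
  <tr>
    <td colspan="3">
      Lance Jam Cut
      <input name="ljc" type="text" value="2" id="ljc" />
    </td>
  </tr>
</table>
"""


def cell_value(td):
    """Return the cell's visible text plus the value of any <input> inside it."""
    parts = [td.get_text(strip=True)]
    for inp in td.find_all("input"):
        # An input without a value attribute is treated as an empty string.
        parts.append(inp.get("value", ""))
    return " ".join(p for p in parts if p)


def table_rows(html):
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # select() returns every table with the class, not only the first one.
    for table in soup.select("table.dentry1"):
        for tr in table.find_all("tr"):
            cells = tr.find_all(["td", "th"])
            rows.append([cell_value(c) for c in cells])
    return rows


rows = table_rows(sample_html)
print(rows)
```

Blank inputs come back as empty strings, so the row layout is preserved. On the real page this would also pick up the hidden db_operation input; filter on the type attribute if that is not wanted.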


You can simply use pandas:

import pandas as pd
html = """
<table class="tbl dentry1" style="margin-top:2vh;">   
                  <tr>
                    <td>Conv - A&nbsp;</td>
                    <td><input name="ctl00$ContentPlaceHolder1$tha" type="text" value="6" id="ctl00_ContentPlaceHolder1_tha" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$la" type="text" value="49" id="ctl00_ContentPlaceHolder1_la" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$ta" type="text" value="49" id="ctl00_ContentPlaceHolder1_ta" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$tta" type="text" value="7.30" id="ctl00_ContentPlaceHolder1_tta" /></td>
                    <td>Mixer-1</td>
                    <td><input name="ctl00$ContentPlaceHolder1$mb1" type="text" id="ctl00_ContentPlaceHolder1_mb1" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$ml1" type="text" id="ctl00_ContentPlaceHolder1_ml1" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$mr1" type="text" id="ctl00_ContentPlaceHolder1_mr1" /></td>
                    <td><input name="ctl00$ContentPlaceHolder1$mc1" type="text" id="ctl00_ContentPlaceHolder1_mc1" /></td>
                </tr>

                <tr>
                <td>Conv - B&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$thb" type="text" value="6" id="ctl00_ContentPlaceHolder1_thb" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$lb" type="text" value="793" id="ctl00_ContentPlaceHolder1_lb" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$tb" type="text" value="58" id="ctl00_ContentPlaceHolder1_tb" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$ttb" type="text" value="6.30" id="ctl00_ContentPlaceHolder1_ttb" /></td>
                <td>Mixer-2</td>
                <td><input name="ctl00$ContentPlaceHolder1$mb2" type="text" value="114" id="ctl00_ContentPlaceHolder1_mb2" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$ml2" type="text" value="2" id="ctl00_ContentPlaceHolder1_ml2" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$mr2" type="text" value="136" id="ctl00_ContentPlaceHolder1_mr2" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$mc2" type="text" value="136" id="ctl00_ContentPlaceHolder1_mc2" /></td>
            </tr>

            <tr>
                <td>Conv - C&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$thc" type="text" value="4" id="ctl00_ContentPlaceHolder1_thc" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$lc" type="text" value="1583" id="ctl00_ContentPlaceHolder1_lc" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$tc" type="text" value="9" id="ctl00_ContentPlaceHolder1_tc" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$ttc" type="text" value="11.00" id="ctl00_ContentPlaceHolder1_ttc" /></td>
                <td colspan="5">&nbsp;&nbsp;</td>
            </tr>

            <tr style="text-align:center;">
                <td></td>
                <td>Poured</td>
                <td>Waiting</td>
                <td></td>
                <td></td>
                <td></td>
                <td>Poured</td>
                <td>Waiting</td>
                <td>Tonnnage</td>
                <td></td>
            </tr>

            <tr>
                <td>Loads</td>
                <td><input name="ctl00$ContentPlaceHolder1$lp" type="text" value="2" id="ctl00_ContentPlaceHolder1_lp" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$lw" type="text" value="0" id="ctl00_ContentPlaceHolder1_lw" /></td>
                <td></td>
                <td></td>
                <td>Torpedos&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$tp" type="text" value="8" id="ctl00_ContentPlaceHolder1_tp" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$tw" type="text" value="1" id="ctl00_ContentPlaceHolder1_tw" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$tt" type="text" value="2407" id="ctl00_ContentPlaceHolder1_tt" /></td>
                <td></td>
            </tr>

            <tr>
                <td>Slag Yard Trips</td>
                <td><input name="ctl00$ContentPlaceHolder1$syt" type="text" value="21" id="ctl00_ContentPlaceHolder1_syt" /></td>
                <td></td>
                <td></td>
                <td></td>
                <td colspan="3">
                    Lance Jam Cut
                    <input name="ctl00$ContentPlaceHolder1$ljc" type="text" value="2" id="ctl00_ContentPlaceHolder1_ljc" />
                </td>
                <td></td>
                <td></td>
            </tr>

            <tr style="text-align:center;">
                <td></td>
                <td>Received</td>
                <td>Consumed</td>
                <td></td>
                <td></td>
                <td></td>
                <td>Used</td>
                <td>Successful</td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Scrap</td>
                <td><input name="ctl00$ContentPlaceHolder1$sr" type="text" value="323" id="ctl00_ContentPlaceHolder1_sr" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$sc" type="text" value="192" id="ctl00_ContentPlaceHolder1_sc" /></td>
                <td></td>
                <td></td>
                <td>Dart&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$du" type="text" value="8" id="ctl00_ContentPlaceHolder1_du" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$ds" type="text" value="7" id="ctl00_ContentPlaceHolder1_ds" /></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <th colspan="10">Bin Position</th>
            </tr>

            <tr style="text-align:center;">
                <td></td>
                <td>Bin 1</td>
                <td>Bin 2</td>
                <td>Bin 3</td>
                <td>Bin 4</td>
                <td>Bin 5</td>
                <td>Bin 6</td>
                <td>Bin 7</td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Conv - A&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$b1a" type="text" value="15" id="ctl00_ContentPlaceHolder1_b1a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b2a" type="text" value="10" id="ctl00_ContentPlaceHolder1_b2a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b3a" type="text" value="45" id="ctl00_ContentPlaceHolder1_b3a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b4a" type="text" value="10" id="ctl00_ContentPlaceHolder1_b4a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b5a" type="text" value="50" id="ctl00_ContentPlaceHolder1_b5a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b6a" type="text" value="25" id="ctl00_ContentPlaceHolder1_b6a" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b7a" type="text" value="0" id="ctl00_ContentPlaceHolder1_b7a" /></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Conv - B&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$b1b" type="text" value="10" id="ctl00_ContentPlaceHolder1_b1b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b2b" type="text" value="5" id="ctl00_ContentPlaceHolder1_b2b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b3b" type="text" value="40" id="ctl00_ContentPlaceHolder1_b3b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b4b" type="text" value="10" id="ctl00_ContentPlaceHolder1_b4b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b5b" type="text" value="60" id="ctl00_ContentPlaceHolder1_b5b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b6b" type="text" value="15" id="ctl00_ContentPlaceHolder1_b6b" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b7b" type="text" value="0" id="ctl00_ContentPlaceHolder1_b7b" /></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Conv - C&nbsp;</td>
                <td><input name="ctl00$ContentPlaceHolder1$b1c" type="text" id="ctl00_ContentPlaceHolder1_b1c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b2c" type="text" id="ctl00_ContentPlaceHolder1_b2c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b3c" type="text" value="50" id="ctl00_ContentPlaceHolder1_b3c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b4c" type="text" id="ctl00_ContentPlaceHolder1_b4c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b5c" type="text" value="45" id="ctl00_ContentPlaceHolder1_b5c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b6c" type="text" id="ctl00_ContentPlaceHolder1_b6c" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$b7c" type="text" value="35" id="ctl00_ContentPlaceHolder1_b7c" /></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Lime</td>
                <td><input name="ctl00$ContentPlaceHolder1$lime" type="text" value="155" id="ctl00_ContentPlaceHolder1_lime" /></td>
                <td style="text-align:center;">Dolo</td>
                <td><input name="ctl00$ContentPlaceHolder1$dolo" type="text" value="55" id="ctl00_ContentPlaceHolder1_dolo" /></td>
                <td style="text-align:center;">Ore</td>
                <td><input name="ctl00$ContentPlaceHolder1$ore" type="text" value="40" id="ctl00_ContentPlaceHolder1_ore" /></td>
                <td style="text-align:center;">Coke</td>
                <td><input name="ctl00$ContentPlaceHolder1$coke" type="text" id="ctl00_ContentPlaceHolder1_coke" /></td>
                <td style="text-align:center;">Raw Dolo</td>
                <td><input name="ctl00$ContentPlaceHolder1$rdolo" type="text" id="ctl00_ContentPlaceHolder1_rdolo" /></td>
            </tr>

            <tr>
                <td>Heats Recovered</td>
                <td><input name="ctl00$ContentPlaceHolder1$hr" type="text" value="16" id="ctl00_ContentPlaceHolder1_hr" /></td>
                <td style="text-align:center;" colspan="2">Amount Recovered</td>
                <td><input name="ctl00$ContentPlaceHolder1$ar" type="text" value="235000" id="ctl00_ContentPlaceHolder1_ar" /></td>
                <td style="text-align:center;" colspan="2">Amount Export</td>
                <td><input name="ctl00$ContentPlaceHolder1$ae" type="text" value="235000" id="ctl00_ContentPlaceHolder1_ae" /></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Holder Level</td>
                <td><input name="ctl00$ContentPlaceHolder1$hl" type="text" id="ctl00_ContentPlaceHolder1_hl" /></td>
                <td style="text-align:center;" colspan="2">Conv-C Recovery</td>
                <td><input name="ctl00$ContentPlaceHolder1$ccr" type="text" id="ctl00_ContentPlaceHolder1_ccr" /></td>
                <td></td>
                <td></td>
                <td></td>
                <td></td>
                <td></td>
            </tr>

            <tr>
                <td>Jam Pot Received</td>
                <td><input name="ctl00$ContentPlaceHolder1$jpr" type="text" value="2" id="ctl00_ContentPlaceHolder1_jpr" /></td>
                <td style="text-align:center;" colspan="2">Clear Pot Sent</td>
                <td><input name="ctl00$ContentPlaceHolder1$cps" type="text" value="2" id="ctl00_ContentPlaceHolder1_cps" /></td>
                <td style="text-align:center;" colspan="2">Ram Tree</td>
                <td><input name="ctl00$ContentPlaceHolder1$rt" type="text" id="ctl00_ContentPlaceHolder1_rt" /></td>
                <td></td>
                <td></td>
            </tr>
            
        </table>

        <table class="tbl dentry1" style="margin-top:2vh;">
            <caption>Delays & Injuries</caption>

            <tr>
                <th colspan="3" style="background-color:#09236b; color:#fff;">Delays</th>
            </tr>

            <tr>
                <th>Delay Time</th>
                <th>Agency</th>
                <th>Details</th>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt1" type="text" id="ctl00_ContentPlaceHolder1_dt1" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da1" type="text" id="ctl00_ContentPlaceHolder1_da1" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd1" type="text" value="LMB" id="ctl00_ContentPlaceHolder1_dd1" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt2" type="text" id="ctl00_ContentPlaceHolder1_dt2" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da2" type="text" id="ctl00_ContentPlaceHolder1_da2" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd2" type="text" id="ctl00_ContentPlaceHolder1_dd2" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt3" type="text" id="ctl00_ContentPlaceHolder1_dt3" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da3" type="text" id="ctl00_ContentPlaceHolder1_da3" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd3" type="text" value="Lime- 120 T       LHF=8.5 T       Dolo=62      I/O=  33T       LD Slag R=20         C=43 T" id="ctl00_ContentPlaceHolder1_dd3" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt4" type="text" id="ctl00_ContentPlaceHolder1_dt4" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da4" type="text" id="ctl00_ContentPlaceHolder1_da4" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd4" type="text" value="Si= 0.61       Basicity=  2.55        MgO= 12.64       SiO2=17.31" id="ctl00_ContentPlaceHolder1_dd4" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt5" type="text" id="ctl00_ContentPlaceHolder1_dt5" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da5" type="text" id="ctl00_ContentPlaceHolder1_da5" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd5" type="text" value="DS=0" id="ctl00_ContentPlaceHolder1_dd5" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt6" type="text" id="ctl00_ContentPlaceHolder1_dt6" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da6" type="text" id="ctl00_ContentPlaceHolder1_da6" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd6" type="text" value="A=  1335-3.90/1.33       B=  1373-1.83/2.33         C=1320-1.37/0.99" id="ctl00_ContentPlaceHolder1_dd6" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td><input name="ctl00$ContentPlaceHolder1$dt7" type="text" id="ctl00_ContentPlaceHolder1_dt7" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$da7" type="text" id="ctl00_ContentPlaceHolder1_da7" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$dd7" type="text" id="ctl00_ContentPlaceHolder1_dd7" style="width:25vw;" /></td>
            </tr>

            <tr>
                <th colspan="3">&nbsp;</th>
            </tr>

            <tr>
                <th colspan="3" style="background-color:#09236b; color:#fff;">Injuries</th>
            </tr>

            <tr>
                <th colspan="2">Name</th>
                <th>Details</th>
            </tr>

            <tr>
                <td colspan="2"><input name="ctl00$ContentPlaceHolder1$in1" type="text" id="ctl00_ContentPlaceHolder1_in1" style="width:10vw;" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$id1" type="text" id="ctl00_ContentPlaceHolder1_id1" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td colspan="2"><input name="ctl00$ContentPlaceHolder1$in2" type="text" id="ctl00_ContentPlaceHolder1_in2" style="width:10vw;" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$id2" type="text" id="ctl00_ContentPlaceHolder1_id2" style="width:25vw;" /></td>
            </tr>

            <tr>
                <td colspan="2"><input name="ctl00$ContentPlaceHolder1$in3" type="text" id="ctl00_ContentPlaceHolder1_in3" style="width:10vw;" /></td>
                <td><input name="ctl00$ContentPlaceHolder1$id3" type="text" id="ctl00_ContentPlaceHolder1_id3" style="width:25vw;" /></td>
            </tr>

            <tr>
                <th colspan="3" style="height:10.5vh;"><input type="hidden" name="ctl00$ContentPlaceHolder1$db_operation" id="ctl00_ContentPlaceHolder1_db_operation" value="update" />
                </th>
            </tr>

        </table>   
"""

dfs = pd.read_html(html)

dfs[0].to_csv("D:\\Table_1.csv", index = False)

Screenshot of Table_1.csv:

[image]

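Edit:

You cannot get the `value` attributes with this method, because `pd.read_html` only reads the text inside each cell, and an `<input>` element has no text. To get the values you have to use BeautifulSoup. Here is the complete code: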

from bs4 import BeautifulSoup
import pandas as pd

html = "Your HTML"

soup = BeautifulSoup(html,'html5lib')

table = soup.find('table', class_ = "tbl dentry1")

tr_tags = table.find_all('tr')

final = []
for tr in tr_tags:
    lst = []
    td_tags = tr.find_all('td')
    for td in td_tags:
        if td.input:
            # Use the input's value; fall back to '' when the attribute is
            # missing so the following cells stay in their own columns.
            lst.append(td.input.get('value', ''))
        else:
            # Plain-text cell: drop non-breaking spaces (&nbsp;).
            lst.append(td.text.replace('\xa0', ''))

    for x in range(10 - len(lst)):
        lst.append("")

    final.append(lst)

columns = [f'Column {x+1}' for x in range(10)]

df = pd.DataFrame(final,columns=columns)

df.to_csv("D:\\Table_1.csv", index = False, encoding='utf-8')

Screenshot of Table_1.csv:

[image]
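The BeautifulSoup code reads only the first table and ignores `colspan`. If you need every table on the page and want spanned cells to keep their column position, here is a rough sketch of one way to do it. This is not the method used above: it swaps in the standard library's `html.parser.HTMLParser` so that it runs without extra packages. The class name `InputTableParser`, the shortened `sample` markup, and the choices to skip hidden inputs and to pad spanned columns with empty strings are all my own assumptions for illustration. Nested tables are not handled.

```python
import csv
import io
from html.parser import HTMLParser


class InputTableParser(HTMLParser):
    """Hypothetical helper: collect each <table> as a list of rows.

    A cell containing an <input> yields that input's value attribute
    (or '' when it has none); other cells yield their text.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tables = []   # one list of rows per <table>
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read
        self._span = 1     # colspan of the current cell

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
            self._span = int(attrs.get("colspan") or 1)
        elif tag == "input" and self._cell is not None:
            # Assumption: hidden inputs are form plumbing, not table data.
            if attrs.get("type") != "hidden":
                self._cell.append(attrs.get("value") or "")

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            text = "".join(self._cell).replace("\xa0", " ").strip()
            self._row.append(text)
            # Pad the columns covered by colspan so later cells stay aligned.
            self._row.extend([""] * (self._span - 1))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


# Shortened stand-in for the page source; the real ids are much longer.
sample = """
<table class="tbl dentry1">
  <tr>
    <td>Conv - A&nbsp;</td>
    <td><input name="tha" type="text" value="6" id="tha" /></td>
    <td><input name="mb1" type="text" id="mb1" /></td>
  </tr>
  <tr>
    <td colspan="2">Ram Tree</td>
    <td><input name="rt" type="text" value="2" id="rt" /></td>
  </tr>
</table>
"""

parser = InputTableParser()
parser.feed(sample)
parser.close()
rows = parser.tables[0]

# Write to an in-memory buffer here; swap in open(path, "w", newline="") for a file.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
print(buf.getvalue())
```

Each entry in `parser.tables` can be passed to `pd.DataFrame` in the same way as `final` above, provided the rows are first padded to a common length.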

