python - Scraping a table from an ASP.NET page
Problem description
<table class="tbl dentry1" style="margin-top:2vh;">
<tr>
<td>Conv - A </td>
<td><input name="ctl00$ContentPlaceHolder1$tha" type="text" value="6" id="ctl00_ContentPlaceHolder1_tha" /></td>
<td><input name="ctl00$ContentPlaceHolder1$la" type="text" value="49" id="ctl00_ContentPlaceHolder1_la" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ta" type="text" value="49" id="ctl00_ContentPlaceHolder1_ta" /></td>
<td><input name="ctl00$ContentPlaceHolder1$tta" type="text" value="7.30" id="ctl00_ContentPlaceHolder1_tta" /></td>
<td>Mixer-1</td>
<td><input name="ctl00$ContentPlaceHolder1$mb1" type="text" id="ctl00_ContentPlaceHolder1_mb1" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ml1" type="text" id="ctl00_ContentPlaceHolder1_ml1" /></td>
<td><input name="ctl00$ContentPlaceHolder1$mr1" type="text" id="ctl00_ContentPlaceHolder1_mr1" /></td>
<td><input name="ctl00$ContentPlaceHolder1$mc1" type="text" id="ctl00_ContentPlaceHolder1_mc1" /></td>
</tr>
<tr>
<td>Conv - B </td>
<td><input name="ctl00$ContentPlaceHolder1$thb" type="text" value="6" id="ctl00_ContentPlaceHolder1_thb" /></td>
<td><input name="ctl00$ContentPlaceHolder1$lb" type="text" value="793" id="ctl00_ContentPlaceHolder1_lb" /></td>
<td><input name="ctl00$ContentPlaceHolder1$tb" type="text" value="58" id="ctl00_ContentPlaceHolder1_tb" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ttb" type="text" value="6.30" id="ctl00_ContentPlaceHolder1_ttb" /></td>
<td>Mixer-2</td>
<td><input name="ctl00$ContentPlaceHolder1$mb2" type="text" value="114" id="ctl00_ContentPlaceHolder1_mb2" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ml2" type="text" value="2" id="ctl00_ContentPlaceHolder1_ml2" /></td>
<td><input name="ctl00$ContentPlaceHolder1$mr2" type="text" value="136" id="ctl00_ContentPlaceHolder1_mr2" /></td>
<td><input name="ctl00$ContentPlaceHolder1$mc2" type="text" value="136" id="ctl00_ContentPlaceHolder1_mc2" /></td>
</tr>
<tr>
<td>Conv - C </td>
<td><input name="ctl00$ContentPlaceHolder1$thc" type="text" value="4" id="ctl00_ContentPlaceHolder1_thc" /></td>
<td><input name="ctl00$ContentPlaceHolder1$lc" type="text" value="1583" id="ctl00_ContentPlaceHolder1_lc" /></td>
<td><input name="ctl00$ContentPlaceHolder1$tc" type="text" value="9" id="ctl00_ContentPlaceHolder1_tc" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ttc" type="text" value="11.00" id="ctl00_ContentPlaceHolder1_ttc" /></td>
<td colspan="5"> </td>
</tr>
<tr style="text-align:center;">
<td></td>
<td>Poured</td>
<td>Waiting</td>
<td></td>
<td></td>
<td></td>
<td>Poured</td>
<td>Waiting</td>
<td>Tonnnage</td>
<td></td>
</tr>
<tr>
<td>Loads</td>
<td><input name="ctl00$ContentPlaceHolder1$lp" type="text" value="2" id="ctl00_ContentPlaceHolder1_lp" /></td>
<td><input name="ctl00$ContentPlaceHolder1$lw" type="text" value="0" id="ctl00_ContentPlaceHolder1_lw" /></td>
<td></td>
<td></td>
<td>Torpedos </td>
<td><input name="ctl00$ContentPlaceHolder1$tp" type="text" value="8" id="ctl00_ContentPlaceHolder1_tp" /></td>
<td><input name="ctl00$ContentPlaceHolder1$tw" type="text" value="1" id="ctl00_ContentPlaceHolder1_tw" /></td>
<td><input name="ctl00$ContentPlaceHolder1$tt" type="text" value="2407" id="ctl00_ContentPlaceHolder1_tt" /></td>
<td></td>
</tr>
<tr>
<td>Slag Yard Trips</td>
<td><input name="ctl00$ContentPlaceHolder1$syt" type="text" value="21" id="ctl00_ContentPlaceHolder1_syt" /></td>
<td></td>
<td></td>
<td></td>
<td colspan="3">
Lance Jam Cut
<input name="ctl00$ContentPlaceHolder1$ljc" type="text" value="2" id="ctl00_ContentPlaceHolder1_ljc" />
</td>
<td></td>
<td></td>
</tr>
<tr style="text-align:center;">
<td></td>
<td>Received</td>
<td>Consumed</td>
<td></td>
<td></td>
<td></td>
<td>Used</td>
<td>Successful</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Scrap</td>
<td><input name="ctl00$ContentPlaceHolder1$sr" type="text" value="323" id="ctl00_ContentPlaceHolder1_sr" /></td>
<td><input name="ctl00$ContentPlaceHolder1$sc" type="text" value="192" id="ctl00_ContentPlaceHolder1_sc" /></td>
<td></td>
<td></td>
<td>Dart </td>
<td><input name="ctl00$ContentPlaceHolder1$du" type="text" value="8" id="ctl00_ContentPlaceHolder1_du" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ds" type="text" value="7" id="ctl00_ContentPlaceHolder1_ds" /></td>
<td></td>
<td></td>
</tr>
<tr>
<th colspan="10">Bin Position</th>
</tr>
<tr style="text-align:center;">
<td></td>
<td>Bin 1</td>
<td>Bin 2</td>
<td>Bin 3</td>
<td>Bin 4</td>
<td>Bin 5</td>
<td>Bin 6</td>
<td>Bin 7</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Conv - A </td>
<td><input name="ctl00$ContentPlaceHolder1$b1a" type="text" value="15" id="ctl00_ContentPlaceHolder1_b1a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b2a" type="text" value="10" id="ctl00_ContentPlaceHolder1_b2a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b3a" type="text" value="45" id="ctl00_ContentPlaceHolder1_b3a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b4a" type="text" value="10" id="ctl00_ContentPlaceHolder1_b4a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b5a" type="text" value="50" id="ctl00_ContentPlaceHolder1_b5a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b6a" type="text" value="25" id="ctl00_ContentPlaceHolder1_b6a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b7a" type="text" value="0" id="ctl00_ContentPlaceHolder1_b7a" /></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Conv - B </td>
<td><input name="ctl00$ContentPlaceHolder1$b1b" type="text" value="10" id="ctl00_ContentPlaceHolder1_b1b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b2b" type="text" value="5" id="ctl00_ContentPlaceHolder1_b2b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b3b" type="text" value="40" id="ctl00_ContentPlaceHolder1_b3b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b4b" type="text" value="10" id="ctl00_ContentPlaceHolder1_b4b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b5b" type="text" value="60" id="ctl00_ContentPlaceHolder1_b5b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b6b" type="text" value="15" id="ctl00_ContentPlaceHolder1_b6b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b7b" type="text" value="0" id="ctl00_ContentPlaceHolder1_b7b" /></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Conv - C </td>
<td><input name="ctl00$ContentPlaceHolder1$b1c" type="text" id="ctl00_ContentPlaceHolder1_b1c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b2c" type="text" id="ctl00_ContentPlaceHolder1_b2c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b3c" type="text" value="50" id="ctl00_ContentPlaceHolder1_b3c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b4c" type="text" id="ctl00_ContentPlaceHolder1_b4c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b5c" type="text" value="45" id="ctl00_ContentPlaceHolder1_b5c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b6c" type="text" id="ctl00_ContentPlaceHolder1_b6c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b7c" type="text" value="35" id="ctl00_ContentPlaceHolder1_b7c" /></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Lime</td>
<td><input name="ctl00$ContentPlaceHolder1$lime" type="text" value="155" id="ctl00_ContentPlaceHolder1_lime" /></td>
<td style="text-align:center;">Dolo</td>
<td><input name="ctl00$ContentPlaceHolder1$dolo" type="text" value="55" id="ctl00_ContentPlaceHolder1_dolo" /></td>
<td style="text-align:center;">Ore</td>
<td><input name="ctl00$ContentPlaceHolder1$ore" type="text" value="40" id="ctl00_ContentPlaceHolder1_ore" /></td>
<td style="text-align:center;">Coke</td>
<td><input name="ctl00$ContentPlaceHolder1$coke" type="text" id="ctl00_ContentPlaceHolder1_coke" /></td>
<td style="text-align:center;">Raw Dolo</td>
<td><input name="ctl00$ContentPlaceHolder1$rdolo" type="text" id="ctl00_ContentPlaceHolder1_rdolo" /></td>
</tr>
<tr>
<td>Heats Recovered</td>
<td><input name="ctl00$ContentPlaceHolder1$hr" type="text" value="16" id="ctl00_ContentPlaceHolder1_hr" /></td>
<td style="text-align:center;" colspan="2">Amount Recovered</td>
<td><input name="ctl00$ContentPlaceHolder1$ar" type="text" value="235000" id="ctl00_ContentPlaceHolder1_ar" /></td>
<td style="text-align:center;" colspan="2">Amount Export</td>
<td><input name="ctl00$ContentPlaceHolder1$ae" type="text" value="235000" id="ctl00_ContentPlaceHolder1_ae" /></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Holder Level</td>
<td><input name="ctl00$ContentPlaceHolder1$hl" type="text" id="ctl00_ContentPlaceHolder1_hl" /></td>
<td style="text-align:center;" colspan="2">Conv-C Recovery</td>
<td><input name="ctl00$ContentPlaceHolder1$ccr" type="text" id="ctl00_ContentPlaceHolder1_ccr" /></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Jam Pot Received</td>
<td><input name="ctl00$ContentPlaceHolder1$jpr" type="text" value="2" id="ctl00_ContentPlaceHolder1_jpr" /></td>
<td style="text-align:center;" colspan="2">Clear Pot Sent</td>
<td><input name="ctl00$ContentPlaceHolder1$cps" type="text" value="2" id="ctl00_ContentPlaceHolder1_cps" /></td>
<td style="text-align:center;" colspan="2">Ram Tree</td>
<td><input name="ctl00$ContentPlaceHolder1$rt" type="text" id="ctl00_ContentPlaceHolder1_rt" /></td>
<td></td>
<td></td>
</tr>
</table>
<table class="tbl dentry1" style="margin-top:2vh;">
<caption>Delays & Injuries</caption>
<tr>
<th colspan="3" style="background-color:#09236b; color:#fff;">Delays</th>
</tr>
<tr>
<th>Delay Time</th>
<th>Agency</th>
<th>Details</th>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt1" type="text" id="ctl00_ContentPlaceHolder1_dt1" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da1" type="text" id="ctl00_ContentPlaceHolder1_da1" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd1" type="text" value="LMB" id="ctl00_ContentPlaceHolder1_dd1" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt2" type="text" id="ctl00_ContentPlaceHolder1_dt2" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da2" type="text" id="ctl00_ContentPlaceHolder1_da2" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd2" type="text" id="ctl00_ContentPlaceHolder1_dd2" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt3" type="text" id="ctl00_ContentPlaceHolder1_dt3" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da3" type="text" id="ctl00_ContentPlaceHolder1_da3" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd3" type="text" value="Lime- 120 T LHF=8.5 T Dolo=62 I/O= 33T LD Slag R=20 C=43 T" id="ctl00_ContentPlaceHolder1_dd3" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt4" type="text" id="ctl00_ContentPlaceHolder1_dt4" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da4" type="text" id="ctl00_ContentPlaceHolder1_da4" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd4" type="text" value="Si= 0.61 Basicity= 2.55 MgO= 12.64 SiO2=17.31" id="ctl00_ContentPlaceHolder1_dd4" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt5" type="text" id="ctl00_ContentPlaceHolder1_dt5" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da5" type="text" id="ctl00_ContentPlaceHolder1_da5" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd5" type="text" value="DS=0" id="ctl00_ContentPlaceHolder1_dd5" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt6" type="text" id="ctl00_ContentPlaceHolder1_dt6" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da6" type="text" id="ctl00_ContentPlaceHolder1_da6" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd6" type="text" value="A= 1335-3.90/1.33 B= 1373-1.83/2.33 C=1320-1.37/0.99" id="ctl00_ContentPlaceHolder1_dd6" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt7" type="text" id="ctl00_ContentPlaceHolder1_dt7" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da7" type="text" id="ctl00_ContentPlaceHolder1_da7" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd7" type="text" id="ctl00_ContentPlaceHolder1_dd7" style="width:25vw;" /></td>
</tr>
<tr>
<th colspan="3"> </th>
</tr>
<tr>
<th colspan="3" style="background-color:#09236b; color:#fff;">Injuries</th>
</tr>
<tr>
<th colspan="2">Name</th>
<th>Details</th>
</tr>
<tr>
<td colspan="2"><input name="ctl00$ContentPlaceHolder1$in1" type="text" id="ctl00_ContentPlaceHolder1_in1" style="width:10vw;" /></td>
<td><input name="ctl00$ContentPlaceHolder1$id1" type="text" id="ctl00_ContentPlaceHolder1_id1" style="width:25vw;" /></td>
</tr>
<tr>
<td colspan="2"><input name="ctl00$ContentPlaceHolder1$in2" type="text" id="ctl00_ContentPlaceHolder1_in2" style="width:10vw;" /></td>
<td><input name="ctl00$ContentPlaceHolder1$id2" type="text" id="ctl00_ContentPlaceHolder1_id2" style="width:25vw;" /></td>
</tr>
<tr>
<td colspan="2"><input name="ctl00$ContentPlaceHolder1$in3" type="text" id="ctl00_ContentPlaceHolder1_in3" style="width:10vw;" /></td>
<td><input name="ctl00$ContentPlaceHolder1$id3" type="text" id="ctl00_ContentPlaceHolder1_id3" style="width:25vw;" /></td>
</tr>
<tr>
<th colspan="3" style="height:10.5vh;"><input type="hidden" name="ctl00$ContentPlaceHolder1$db_operation" id="ctl00_ContentPlaceHolder1_db_operation" value="update" />
</th>
</tr>
</table>
I need to extract the value from each cell. Some of them may be blank.
The code I tried is as follows:
table = soup.find('table',{'class':'tbl dentry1'})
data = []
table_rows = table.find_all('tr')
l = []
for tr in table_rows:
    td = tr.find_all('td')
    row = [tr.text for tr in td]
    l.append(row)
print(l)
But it only prints:
[[], [], ['Conv - A\xa0', '', '', '', '', 'Mixer-1', '', '', '', ''], ['Conv - B\xa0', '', '', '', '', 'Mixer-2', '', '', '', ''], ['Conv - C\xa0', '', '', '', '', '\xa0\xa0'], ['', 'Poured', 'Waiting', '', '', '', 'Poured', 'Waiting', 'Tonnnage', ''], ['Loads', '', '', '', '', 'Torpedos\xa0', '', '', '', ''], ['Slag Yard Trips', '', '', '', '', '\r\n Lance Jam Cut\r\n \n', '', ''], ['', 'Received', 'Consumed', '', '', '', 'Used', 'Successful', '', ''], ['Scrap ', '', '', '', '', 'Dart\xa0', '', '', '', ''], [], ['', 'Bin 1', 'Bin 2', 'Bin 3', 'Bin 4', 'Bin 5', 'Bin 6', 'Bin 7', '', ''], ['Conv - A\xa0', '', '', '', '', '', '', '', '', ''], ['Conv - B\xa0', '', '', '', '', '', '', '', '', ''], ['Conv - C\xa0', '', '', '', '', '', '', '', '', ''], ['Lime', '', 'Dolo', '', 'Ore', '', 'Coke', '', 'Raw Dolo', ''], ['Heats Recovered', '', 'Amount Recovered', '', 'Amount Export', '', '', ''], ['Holder Level', '', 'Conv-C Recovery', '', '', '', '', '', ''], ['Jam Pot Received', '', 'Clear Pot Sent', '', 'Ram Tree', '', '', '']]
Where am I going wrong?
Solution
You can simply use pandas:
import pandas as pd
html = """
<table class="tbl dentry1" style="margin-top:2vh;">
<tr>
<td>Conv - A </td>
<td><input name="ctl00$ContentPlaceHolder1$tha" type="text" value="6" id="ctl00_ContentPlaceHolder1_tha" /></td>
<td><input name="ctl00$ContentPlaceHolder1$la" type="text" value="49" id="ctl00_ContentPlaceHolder1_la" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ta" type="text" value="49" id="ctl00_ContentPlaceHolder1_ta" /></td>
<td><input name="ctl00$ContentPlaceHolder1$tta" type="text" value="7.30" id="ctl00_ContentPlaceHolder1_tta" /></td>
<td>Mixer-1</td>
<td><input name="ctl00$ContentPlaceHolder1$mb1" type="text" id="ctl00_ContentPlaceHolder1_mb1" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ml1" type="text" id="ctl00_ContentPlaceHolder1_ml1" /></td>
<td><input name="ctl00$ContentPlaceHolder1$mr1" type="text" id="ctl00_ContentPlaceHolder1_mr1" /></td>
<td><input name="ctl00$ContentPlaceHolder1$mc1" type="text" id="ctl00_ContentPlaceHolder1_mc1" /></td>
</tr>
<tr>
<td>Conv - B </td>
<td><input name="ctl00$ContentPlaceHolder1$thb" type="text" value="6" id="ctl00_ContentPlaceHolder1_thb" /></td>
<td><input name="ctl00$ContentPlaceHolder1$lb" type="text" value="793" id="ctl00_ContentPlaceHolder1_lb" /></td>
<td><input name="ctl00$ContentPlaceHolder1$tb" type="text" value="58" id="ctl00_ContentPlaceHolder1_tb" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ttb" type="text" value="6.30" id="ctl00_ContentPlaceHolder1_ttb" /></td>
<td>Mixer-2</td>
<td><input name="ctl00$ContentPlaceHolder1$mb2" type="text" value="114" id="ctl00_ContentPlaceHolder1_mb2" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ml2" type="text" value="2" id="ctl00_ContentPlaceHolder1_ml2" /></td>
<td><input name="ctl00$ContentPlaceHolder1$mr2" type="text" value="136" id="ctl00_ContentPlaceHolder1_mr2" /></td>
<td><input name="ctl00$ContentPlaceHolder1$mc2" type="text" value="136" id="ctl00_ContentPlaceHolder1_mc2" /></td>
</tr>
<tr>
<td>Conv - C </td>
<td><input name="ctl00$ContentPlaceHolder1$thc" type="text" value="4" id="ctl00_ContentPlaceHolder1_thc" /></td>
<td><input name="ctl00$ContentPlaceHolder1$lc" type="text" value="1583" id="ctl00_ContentPlaceHolder1_lc" /></td>
<td><input name="ctl00$ContentPlaceHolder1$tc" type="text" value="9" id="ctl00_ContentPlaceHolder1_tc" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ttc" type="text" value="11.00" id="ctl00_ContentPlaceHolder1_ttc" /></td>
<td colspan="5"> </td>
</tr>
<tr style="text-align:center;">
<td></td>
<td>Poured</td>
<td>Waiting</td>
<td></td>
<td></td>
<td></td>
<td>Poured</td>
<td>Waiting</td>
<td>Tonnnage</td>
<td></td>
</tr>
<tr>
<td>Loads</td>
<td><input name="ctl00$ContentPlaceHolder1$lp" type="text" value="2" id="ctl00_ContentPlaceHolder1_lp" /></td>
<td><input name="ctl00$ContentPlaceHolder1$lw" type="text" value="0" id="ctl00_ContentPlaceHolder1_lw" /></td>
<td></td>
<td></td>
<td>Torpedos </td>
<td><input name="ctl00$ContentPlaceHolder1$tp" type="text" value="8" id="ctl00_ContentPlaceHolder1_tp" /></td>
<td><input name="ctl00$ContentPlaceHolder1$tw" type="text" value="1" id="ctl00_ContentPlaceHolder1_tw" /></td>
<td><input name="ctl00$ContentPlaceHolder1$tt" type="text" value="2407" id="ctl00_ContentPlaceHolder1_tt" /></td>
<td></td>
</tr>
<tr>
<td>Slag Yard Trips</td>
<td><input name="ctl00$ContentPlaceHolder1$syt" type="text" value="21" id="ctl00_ContentPlaceHolder1_syt" /></td>
<td></td>
<td></td>
<td></td>
<td colspan="3">
Lance Jam Cut
<input name="ctl00$ContentPlaceHolder1$ljc" type="text" value="2" id="ctl00_ContentPlaceHolder1_ljc" />
</td>
<td></td>
<td></td>
</tr>
<tr style="text-align:center;">
<td></td>
<td>Received</td>
<td>Consumed</td>
<td></td>
<td></td>
<td></td>
<td>Used</td>
<td>Successful</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Scrap</td>
<td><input name="ctl00$ContentPlaceHolder1$sr" type="text" value="323" id="ctl00_ContentPlaceHolder1_sr" /></td>
<td><input name="ctl00$ContentPlaceHolder1$sc" type="text" value="192" id="ctl00_ContentPlaceHolder1_sc" /></td>
<td></td>
<td></td>
<td>Dart </td>
<td><input name="ctl00$ContentPlaceHolder1$du" type="text" value="8" id="ctl00_ContentPlaceHolder1_du" /></td>
<td><input name="ctl00$ContentPlaceHolder1$ds" type="text" value="7" id="ctl00_ContentPlaceHolder1_ds" /></td>
<td></td>
<td></td>
</tr>
<tr>
<th colspan="10">Bin Position</th>
</tr>
<tr style="text-align:center;">
<td></td>
<td>Bin 1</td>
<td>Bin 2</td>
<td>Bin 3</td>
<td>Bin 4</td>
<td>Bin 5</td>
<td>Bin 6</td>
<td>Bin 7</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Conv - A </td>
<td><input name="ctl00$ContentPlaceHolder1$b1a" type="text" value="15" id="ctl00_ContentPlaceHolder1_b1a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b2a" type="text" value="10" id="ctl00_ContentPlaceHolder1_b2a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b3a" type="text" value="45" id="ctl00_ContentPlaceHolder1_b3a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b4a" type="text" value="10" id="ctl00_ContentPlaceHolder1_b4a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b5a" type="text" value="50" id="ctl00_ContentPlaceHolder1_b5a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b6a" type="text" value="25" id="ctl00_ContentPlaceHolder1_b6a" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b7a" type="text" value="0" id="ctl00_ContentPlaceHolder1_b7a" /></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Conv - B </td>
<td><input name="ctl00$ContentPlaceHolder1$b1b" type="text" value="10" id="ctl00_ContentPlaceHolder1_b1b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b2b" type="text" value="5" id="ctl00_ContentPlaceHolder1_b2b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b3b" type="text" value="40" id="ctl00_ContentPlaceHolder1_b3b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b4b" type="text" value="10" id="ctl00_ContentPlaceHolder1_b4b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b5b" type="text" value="60" id="ctl00_ContentPlaceHolder1_b5b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b6b" type="text" value="15" id="ctl00_ContentPlaceHolder1_b6b" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b7b" type="text" value="0" id="ctl00_ContentPlaceHolder1_b7b" /></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Conv - C </td>
<td><input name="ctl00$ContentPlaceHolder1$b1c" type="text" id="ctl00_ContentPlaceHolder1_b1c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b2c" type="text" id="ctl00_ContentPlaceHolder1_b2c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b3c" type="text" value="50" id="ctl00_ContentPlaceHolder1_b3c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b4c" type="text" id="ctl00_ContentPlaceHolder1_b4c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b5c" type="text" value="45" id="ctl00_ContentPlaceHolder1_b5c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b6c" type="text" id="ctl00_ContentPlaceHolder1_b6c" /></td>
<td><input name="ctl00$ContentPlaceHolder1$b7c" type="text" value="35" id="ctl00_ContentPlaceHolder1_b7c" /></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Lime</td>
<td><input name="ctl00$ContentPlaceHolder1$lime" type="text" value="155" id="ctl00_ContentPlaceHolder1_lime" /></td>
<td style="text-align:center;">Dolo</td>
<td><input name="ctl00$ContentPlaceHolder1$dolo" type="text" value="55" id="ctl00_ContentPlaceHolder1_dolo" /></td>
<td style="text-align:center;">Ore</td>
<td><input name="ctl00$ContentPlaceHolder1$ore" type="text" value="40" id="ctl00_ContentPlaceHolder1_ore" /></td>
<td style="text-align:center;">Coke</td>
<td><input name="ctl00$ContentPlaceHolder1$coke" type="text" id="ctl00_ContentPlaceHolder1_coke" /></td>
<td style="text-align:center;">Raw Dolo</td>
<td><input name="ctl00$ContentPlaceHolder1$rdolo" type="text" id="ctl00_ContentPlaceHolder1_rdolo" /></td>
</tr>
<tr>
<td>Heats Recovered</td>
<td><input name="ctl00$ContentPlaceHolder1$hr" type="text" value="16" id="ctl00_ContentPlaceHolder1_hr" /></td>
<td style="text-align:center;" colspan="2">Amount Recovered</td>
<td><input name="ctl00$ContentPlaceHolder1$ar" type="text" value="235000" id="ctl00_ContentPlaceHolder1_ar" /></td>
<td style="text-align:center;" colspan="2">Amount Export</td>
<td><input name="ctl00$ContentPlaceHolder1$ae" type="text" value="235000" id="ctl00_ContentPlaceHolder1_ae" /></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Holder Level</td>
<td><input name="ctl00$ContentPlaceHolder1$hl" type="text" id="ctl00_ContentPlaceHolder1_hl" /></td>
<td style="text-align:center;" colspan="2">Conv-C Recovery</td>
<td><input name="ctl00$ContentPlaceHolder1$ccr" type="text" id="ctl00_ContentPlaceHolder1_ccr" /></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Jam Pot Received</td>
<td><input name="ctl00$ContentPlaceHolder1$jpr" type="text" value="2" id="ctl00_ContentPlaceHolder1_jpr" /></td>
<td style="text-align:center;" colspan="2">Clear Pot Sent</td>
<td><input name="ctl00$ContentPlaceHolder1$cps" type="text" value="2" id="ctl00_ContentPlaceHolder1_cps" /></td>
<td style="text-align:center;" colspan="2">Ram Tree</td>
<td><input name="ctl00$ContentPlaceHolder1$rt" type="text" id="ctl00_ContentPlaceHolder1_rt" /></td>
<td></td>
<td></td>
</tr>
</table>
<table class="tbl dentry1" style="margin-top:2vh;">
<caption>Delays & Injuries</caption>
<tr>
<th colspan="3" style="background-color:#09236b; color:#fff;">Delays</th>
</tr>
<tr>
<th>Delay Time</th>
<th>Agency</th>
<th>Details</th>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt1" type="text" id="ctl00_ContentPlaceHolder1_dt1" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da1" type="text" id="ctl00_ContentPlaceHolder1_da1" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd1" type="text" value="LMB" id="ctl00_ContentPlaceHolder1_dd1" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt2" type="text" id="ctl00_ContentPlaceHolder1_dt2" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da2" type="text" id="ctl00_ContentPlaceHolder1_da2" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd2" type="text" id="ctl00_ContentPlaceHolder1_dd2" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt3" type="text" id="ctl00_ContentPlaceHolder1_dt3" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da3" type="text" id="ctl00_ContentPlaceHolder1_da3" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd3" type="text" value="Lime- 120 T LHF=8.5 T Dolo=62 I/O= 33T LD Slag R=20 C=43 T" id="ctl00_ContentPlaceHolder1_dd3" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt4" type="text" id="ctl00_ContentPlaceHolder1_dt4" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da4" type="text" id="ctl00_ContentPlaceHolder1_da4" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd4" type="text" value="Si= 0.61 Basicity= 2.55 MgO= 12.64 SiO2=17.31" id="ctl00_ContentPlaceHolder1_dd4" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt5" type="text" id="ctl00_ContentPlaceHolder1_dt5" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da5" type="text" id="ctl00_ContentPlaceHolder1_da5" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd5" type="text" value="DS=0" id="ctl00_ContentPlaceHolder1_dd5" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt6" type="text" id="ctl00_ContentPlaceHolder1_dt6" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da6" type="text" id="ctl00_ContentPlaceHolder1_da6" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd6" type="text" value="A= 1335-3.90/1.33 B= 1373-1.83/2.33 C=1320-1.37/0.99" id="ctl00_ContentPlaceHolder1_dd6" style="width:25vw;" /></td>
</tr>
<tr>
<td><input name="ctl00$ContentPlaceHolder1$dt7" type="text" id="ctl00_ContentPlaceHolder1_dt7" /></td>
<td><input name="ctl00$ContentPlaceHolder1$da7" type="text" id="ctl00_ContentPlaceHolder1_da7" /></td>
<td><input name="ctl00$ContentPlaceHolder1$dd7" type="text" id="ctl00_ContentPlaceHolder1_dd7" style="width:25vw;" /></td>
</tr>
<tr>
<th colspan="3"> </th>
</tr>
<tr>
<th colspan="3" style="background-color:#09236b; color:#fff;">Injuries</th>
</tr>
<tr>
<th colspan="2">Name</th>
<th>Details</th>
</tr>
<tr>
<td colspan="2"><input name="ctl00$ContentPlaceHolder1$in1" type="text" id="ctl00_ContentPlaceHolder1_in1" style="width:10vw;" /></td>
<td><input name="ctl00$ContentPlaceHolder1$id1" type="text" id="ctl00_ContentPlaceHolder1_id1" style="width:25vw;" /></td>
</tr>
<tr>
<td colspan="2"><input name="ctl00$ContentPlaceHolder1$in2" type="text" id="ctl00_ContentPlaceHolder1_in2" style="width:10vw;" /></td>
<td><input name="ctl00$ContentPlaceHolder1$id2" type="text" id="ctl00_ContentPlaceHolder1_id2" style="width:25vw;" /></td>
</tr>
<tr>
<td colspan="2"><input name="ctl00$ContentPlaceHolder1$in3" type="text" id="ctl00_ContentPlaceHolder1_in3" style="width:10vw;" /></td>
<td><input name="ctl00$ContentPlaceHolder1$id3" type="text" id="ctl00_ContentPlaceHolder1_id3" style="width:25vw;" /></td>
</tr>
<tr>
<th colspan="3" style="height:10.5vh;"><input type="hidden" name="ctl00$ContentPlaceHolder1$db_operation" id="ctl00_ContentPlaceHolder1_db_operation" value="update" />
</th>
</tr>
</table>
"""
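# Recent pandas versions expect a file-like object rather than a literal HTML string.
from io import StringIO
dfs = pd.read_html(StringIO(html))
dfs[0].to_csv("D:\\Table_1.csv", index=False)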
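The loop in the question prints empty strings for the same reason the CSV above has blank cells: an input element has no text content, so the number shown in the browser lives in its value attribute. A minimal sketch on a two-cell fragment, assuming BeautifulSoup with the built-in html.parser (the fragment and the variable names are illustrative, not taken from the real page):

```python
from bs4 import BeautifulSoup

# Two-cell fragment modelled on the first row of the table (illustrative only).
fragment = """
<table><tr>
<td>Conv - A</td>
<td><input name="ctl00$ContentPlaceHolder1$tha" type="text" value="6" /></td>
</tr></table>
"""

soup = BeautifulSoup(fragment, "html.parser")
label_cell, input_cell = soup.find_all("td")

# .text only returns text nodes; an <input> has none, so this is empty.
cell_text = input_cell.text
# The number shown in the browser is stored in the value attribute.
cell_value = input_cell.input.get("value", "")

print(repr(label_cell.text))  # 'Conv - A'
print(repr(cell_text))        # ''
print(repr(cell_value))       # '6'
```

Using .get("value", "") rather than ["value"] avoids a KeyError for inputs that were left blank on the page.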
EDIT:
You cannot get the value attributes with read_html. For that, you have to use BeautifulSoup. Here is the complete code:
from bs4 import BeautifulSoup
import pandas as pd

html = "Your HTML"
soup = BeautifulSoup(html, 'html5lib')
table = soup.find('table', class_="tbl dentry1")
tr_tags = table.find_all('tr')
final = []
for tr in tr_tags:
    lst = []
    td_tags = tr.find_all('td')
    for td in td_tags:
        if td.input:
            # The data sits in the input's value attribute; use '' when the
            # attribute is missing so later cells stay in their own column.
            lst.append(td.input.get('value', ''))
        elif td.text != "":
            lst.append(td.text.replace('\xa0', ''))
        else:
            lst.append('')
    # Pad short rows (colspan cells, header-only rows) to 10 columns.
    for x in range(10 - len(lst)):
        lst.append("")
    final.append(lst)
columns = [f'Column {x+1}' for x in range(10)]
df = pd.DataFrame(final, columns=columns)
df.to_csv("D:\\Table_1.csv", index=False, encoding='utf-8')
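Note that soup.find returns only the first matching table, so the "Delays & Injuries" table is not covered by the code above. If a flat field-to-value mapping is more convenient than a grid, one possible sketch follows. It assumes BeautifulSoup with the built-in html.parser; the helper name collect_input_values and the cut-down HTML are hypothetical, and the key is simply the part of each input's name after the last "$".

```python
from bs4 import BeautifulSoup

# Cut-down stand-in for the page: two tables sharing the same class.
html = """
<table class="tbl dentry1">
<tr><td>Conv - A</td>
<td><input name="ctl00$ContentPlaceHolder1$tha" type="text" value="6" /></td>
<td><input name="ctl00$ContentPlaceHolder1$mb1" type="text" /></td></tr>
</table>
<table class="tbl dentry1">
<tr><td><input name="ctl00$ContentPlaceHolder1$dd1" type="text" value="LMB" /></td></tr>
<tr><th><input type="hidden" name="ctl00$ContentPlaceHolder1$db_operation" value="update" /></th></tr>
</table>
"""


def collect_input_values(markup):
    """Hypothetical helper: map each text input's short name to its value."""
    soup = BeautifulSoup(markup, "html.parser")
    values = {}
    # find_all (unlike find) returns every matching table, not just the first.
    for table in soup.find_all("table", class_="dentry1"):
        # Restrict to type="text" so the hidden db_operation field is skipped.
        for field in table.find_all("input", attrs={"type": "text"}):
            short_name = field["name"].split("$")[-1]  # e.g. 'tha'
            values[short_name] = field.get("value", "")  # '' when left blank
    return values


values = collect_input_values(html)
print(values)  # {'tha': '6', 'mb1': '', 'dd1': 'LMB'}
```

A dictionary keyed by field name does not depend on column position, so colspan cells and blank inputs cannot shift values into the wrong column.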