How to parse an HTML table with rowspans in Flutter?

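Problem description

I am trying to parse an HTML table that contains cells with rowspans.

The problem I'm running into is that when a row contains a cell with a rowspan, the following rows have one fewer TD: the slot covered by the rowspan is simply missing, so indexing the cells by position picks up the wrong day. Any help would be greatly appreciated, thanks!

University timetable

Flutter app

HTML table: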

<table cellspacing="0" cellpadding="0" class="list">
    <tbody>
        <tr>
            <th>교시(Class)</th>
            <th>월(Mon)</th>
            <th>화(Tue)</th>
            <th>수(Wed)</th>
            <th>목(Thu)</th>
            <th>금(Fri)</th>
            <th>토(Sat)</th>
        </tr>
        <tr class="tr_st">
            <td>1교시<br>(09:00~09:50)</td>
            <td></td>
            <td rowspan="3">항공기술영어<br>나래관 601호<br></td>
            <td></td>
            <td rowspan="3">비행기역학<br>나래관 204호<br></td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>2교시<br>(10:00~10:50)</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>3교시<br>(11:00~11:50)</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>4교시<br>(12:00~12:50)</td>
            <td></td>
            <td></td>
            <td rowspan="3">비파괴검사개론<br>나래관 404호<br></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>5교시<br>(13:00~13:50)</td>
            <td rowspan="2">항공기기체기초실습I<br>나래관 301호 항공기체·헬기정비 실습실<br></td>
            <td rowspan="2">항공기전자기초실습I<br>나래관 302호 항공전자실습실<br></td>
            <td rowspan="3">항공법규<br>나래관 204호<br></td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>6교시<br>(14:00~14:50)</td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>7교시<br>(15:00~15:50)</td>
            <td rowspan="2">항공정비일반<br>나래관 202호 어학실<br></td>
            <td></td>
            <td rowspan="3">항공계기I<br>나래관 502호<br></td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>8교시<br>(16:00~16:50)</td>
            <td></td>
            <td rowspan="2">항공기기관I<br>나래관 204호<br></td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>9교시<br>(17:00~17:50)</td>
            <td rowspan="1">인성학I<br>나래관 202호 어학실<br></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>10교시<br>(18:00~18:50)</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>11교시<br>(19:00~19:50)</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr class="tr_st">
            <td>12교시<br>(20:00~20:50)</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
    </tbody>
</table>

What I tried

Dart code:

final response = await http.get(Url.Schedule, headers: LoginScreenController.cookieheaders);
var document = parse(response.body);
var rows = document.getElementsByClassName('list')[0].getElementsByTagName('tr');

for (int i = 1; i < rows.length; i++) {
    var _mon = document
      .getElementsByClassName('list')[0]
      .getElementsByTagName('tr')[i]
      .getElementsByTagName('td')[1];

    var _tue = document
      .getElementsByClassName('list')[0]
      .getElementsByTagName('tr')[i]
      .getElementsByTagName('td')[2];

    var _wed = document
      .getElementsByClassName('list')[0]
      .getElementsByTagName('tr')[i]
      .getElementsByTagName('td')[3];

    var _thu = document
      .getElementsByClassName('list')[0]
      .getElementsByTagName('tr')[i]
      .getElementsByTagName('td')[4];

    var _fri = document
      .getElementsByClassName('list')[0]
      .getElementsByTagName('tr')[i]
      .getElementsByTagName('td')[5];

    var _sat = document
      .getElementsByClassName('list')[0]
      .getElementsByTagName('tr')[i]
      .getElementsByTagName('td')[6];
          
    mon[i] = _mon.text;
    tue[i] = _tue.text;
    wed[i] = _wed.text;
    thu[i] = _thu.text;
    fri[i] = _fri.text;
    sat[i] = _sat.text;
}

Tags: html, flutter, dart, html-table, html-parsing

Solution

One approach is to convert the HTML table into a matrix and work with that:

import 'package:_samples2/data/data5.dart'; // local file in the answerer's project; defines raw_table (the HTML above)
import 'package:html/parser.dart' as html show parse;
import 'package:html/dom.dart';

class AnyServer {
  static Document fetchSchedule() => html.parse(raw_table);

  static void convertTableToMatrix() {
    var rows = fetchSchedule().getElementsByClassName('list')[0].getElementsByTagName('tr');
    var cols = rows[0].getElementsByTagName('th');
    // One slot per table cell; null means "not filled yet".
    var schedule = List.generate(rows.length, (_) => List<String?>.filled(cols.length, null));

    for (var r = 0; r < rows.length; r++) {
      // The header row uses <th>; every other row uses <td>.
      cols = rows[r].getElementsByTagName(r == 0 ? 'th' : 'td');
      var c = 0;
      for (final col in cols) {
        // Skip slots already filled by a rowspan from an earlier row.
        while (schedule[r][c] != null) { c++; }
        schedule[r][c] = col.text;
        // Copy the text down into the rows this cell spans,
        // without running past the last row of the table.
        final rs = int.tryParse(col.attributes['rowspan'] ?? '') ?? 1;
        for (var i = 1; i < rs && r + i < schedule.length; i++) {
          schedule[r + i][c] ??= col.text;
        }
        c++;
      }
    }
    print(schedule);
  }
}

void main(List<String> args) {
  AnyServer.convertTableToMatrix();
}

Result:

[
  [교시(Class), 월(Mon), 화(Tue), 수(Wed), 목(Thu), 금(Fri), 토(Sat)], 
  [1교시(09:00~09:50), , 항공기술영어나래관 601호, , 비행기역학나래관 204호, , ], 
  [2교시(10:00~10:50), , 항공기술영어나래관 601호, , 비행기역학나래관 204호, , ], 
  [3교시(11:00~11:50), , 항공기술영어나래관 601호, , 비행기역학나래관 204호, , ], 
  [4교시(12:00~12:50), , , 비파괴검사개론나래관 404호, , , ], 
  [5교시(13:00~13:50), 항공기기체기초실습I나래관 301호 항공기체·헬기정비 실습실, 항공기전자기초실습I나래관 302호 항공전자실습실, 비파괴검사개론나래관 404호, 항공법규나래관 204호, , ], 
  [6교시(14:00~14:50), 항공기기체기초실습I나래관 301호 항공기체·헬기정비 실습실, 항공기전자기초실습I나래관 302호 항공전자실습실, 비파괴검사개론나래관 404호, 항공법규나래관 204호, , ], 
  [7교시(15:00~15:50), 항공정비일반나래관 202호 어학실, , 항공계기I나래관 502호, 항공법규나래관 204호, , ], 
  [8교시(16:00~16:50), 항공정비일반나래관 202호 어학실, , 항공계기I나래관 502호, 항공기기관I나래관 204호, , ], 
  [9교시(17:00~17:50), 인성학I나래관 202호 어학실, , 항공계기I나래관 502호, 항공기기관I나래관 204호, , ], 
  [10교시(18:00~18:50), , , , , , ], 
  [11교시(19:00~19:50), , , , , , ], 
  [12교시(20:00~20:50), , , , , , ]
]
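
With the matrix in hand, the per-day lists from the question (mon, tue, ...) are simply its columns. The following is a minimal sketch, not part of the original answer: columnOf is a hypothetical helper, and the small schedule literal is made-up data shaped like the matrix printed above. Note that it drops the header row, so index 0 is the first period rather than the header.

```dart
/// Returns column [c] of [matrix] without the header row.
/// Slots that were never filled (null) become empty strings.
List<String> columnOf(List<List<String?>> matrix, int c) =>
    [for (final row in matrix.skip(1)) row[c] ?? ''];

void main() {
  // Hypothetical three-period excerpt, shaped like the printed matrix:
  // row 0 is the header, column 0 is the period label.
  final schedule = <List<String?>>[
    ['Class', 'Mon', 'Tue'],
    ['1', '', 'English'],
    ['2', '', 'English'],
    ['3', 'Lab', ''],
  ];

  final mon = columnOf(schedule, 1);
  final tue = columnOf(schedule, 2);
  print(mon); // [, , Lab]
  print(tue); // [English, English, ]
}
```

Because the rowspan text has already been copied into every row it covers, each column has exactly one entry per period and no position-based td lookups are needed.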
