How do you make abstracted nested dictionary/hash table objects readable (stylistically)?

Question

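I'm parsing a database table that was exported to CSV, in which one embedded field is essentially a memo field. The database also keeps version history, and the CSV contains every version. The basic structure of the data is Index (sequential record number), Reference (a specific foreign key), Sequence (the order of records for a given reference), and Data (the memo field holding the data to parse).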

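You can think of the Data field as a text document limited to 80 characters wide by 40 characters deep, with the records sequenced in the order they would print. Every record entry is assigned an ascending index. For reference, $myParser is a [Microsoft.VisualBasic.FileIO.TextFieldParser], so ReadFields() returns one row of fields as an array/list.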

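My ultimate question is: how can I format this so it is more intuitive for the reader? The code below is PowerShell, but I'd also be interested in C#-related answers, since this is a language-agnostic style question, although I suspect get/set would make this somewhat trivial.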

Consider the following code (an insert/update routine on a dictionary/hash nested two levels deep):

enum cmtField
{
    Index = 0
    Sequence = 1
    Reference = 2
    Data = 4
}

$myRecords = [System.Collections.Generic.Dictionary[int,System.Collections.Generic.Dictionary[int,string]]]::new() #this could be a hash table, but is more verbose this way
While($true) #there's actually control here, but this provides a simple loop assuming infinite data
{
    $myFields = $myParser.ReadFields() #read a line from the csvfile and return an array/list of fields for that line

    if(!$myRecords.ContainsKey($myFields[[cmtField]::Reference])) #if the reference of the current record is new
    {
        $myRecords.Add($myFields[[cmtField]::Reference],[System.Collections.Generic.Dictionary[int,CommentRecord]]::new()) #create tier 1 reference index
        $myRecords[$myFields[[cmtField]::Reference]].add($myFields[[cmtField]::Sequence],$myFields[[cmtField]::Data]) #create tier 2 sequence reference and data
    }
    else #if the reference already exists in the dictionary
    {
        if(!$myRecords[$myFields[[cmtField]::Reference]].ContainsKey($myFields[[cmtField]::Sequence])) #if the sequence ID of the current record is new
        {
            $myRecords[$myFields[[cmtField]::Reference]].Add($myFields[[cmtField]::Sequence],$myFields[[cmtField]::Data]) #add record at [reference][sequence]
        }
        else #if the sequence already exists for this reference
        {
            if($myRecords[$myFields[[cmtField]::Reference]][$myFields[[cmtField]::Sequence]].Index -lt $myFields[[cmtField]::Index]) #if the index of the currently read record is higher than the stored index, it must be newer
            {
                $myRecords[$myFields[[cmtField]::Reference]][$myFields[[cmtField]::Sequence]] = $myFields[[cmtField]::Data] #replace with new data
            }
            #else discard currently read data (do nothing)
        }
    }
}

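Frankly, trying to make this readable gives me a headache and makes my eyes bleed a little. The deeper the dictionary goes, the messier it gets. I'm stuck between bracket soup and code that isn't self-documenting.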

Tags: powershell, powershell-5.0

Answer


"My ultimate question is, how can I format this so it is more intuitive for the reader?"

That... ultimately depends on who the "reader" is. Your boss? Your colleagues? Me? Would you use this code sample to teach someone programming?

In terms of making it less "messy", there are a few steps you can take right away.

The first thing I'd change to make your code more readable is to add a using namespace directive at the top of the file:

using namespace System.Collections.Generic

Now you can create the nested dictionary with:

[Dictionary[int,Dictionary[int,string]]]::new()

...as opposed to:

[System.Collections.Generic.Dictionary[int,System.Collections.Generic.Dictionary[int,string]]]::new()

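The next thing I'd cut down is the repeated index-access pattern, such as $myFields[[cmtField]::Reference]. You never modify $myFields after the initial assignment at the top of the loop, so there is no need to defer resolving those values: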

while($true)
{
    $myFields = $myParser.ReadFields()

    $Reference = $myFields[[cmtField]::Reference]
    $Data      = $myFields[[cmtField]::Data]
    $Sequence  = $myFields[[cmtField]::Sequence]
    $Index     = $myFields[[cmtField]::Index]

    if(!$myRecords.ContainsKey($Reference)) #if the reference of the current record is new
    {
        $myRecords.Add($Reference,[Dictionary[int,CommentRecord]]::new()) #create tier 1 reference index
        $myRecords[$Reference].Add($Sequence,$Data) #create tier 2 sequence reference and data
    }
    else 
    {
        # ...

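Finally, you can simplify the code considerably by dropping the nested if/else statements and breaking the logic down into a series of steps that each record passes through one at a time. You end up with something like this: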

using namespace System.Collections.Generic

enum cmtField
{
    Index = 0
    Sequence = 1
    Reference = 2
    Data = 4
}

$myRecords = [Dictionary[int,Dictionary[int,CommentRecord]]]::new() 
while($true) 
{
    $myFields = $myParser.ReadFields()

    $Reference = $myFields[[cmtField]::Reference]
    $Data = $myFields[[cmtField]::Data]
    $Sequence = $myFields[[cmtField]::Sequence]
    $Index = $myFields[[cmtField]::Index]

    # Step 1 - ensure tier 1 dictionary is present
    if(!$myRecords.ContainsKey($Reference))
    {
        $myRecords.Add($Reference,[Dictionary[int,CommentRecord]]::new())
    }
    
    # (now we only need to resolve `$myRecords[$Reference]` once)
    $record = $myRecords[$Reference]

    # Step 2 - ensure sequence entry exists
    if(!$record.ContainsKey($Sequence))
    {
        $record.Add($Sequence, $Data)
    }

    # Step 3 - handle superseding comment records
    if($record[$Sequence].Index -lt $Index) 
    {
        $record[$Sequence] = $Data 
    }
}

Personally, I find this easier on the eyes (and the mind) than the original if/else approach.
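
One caveat: the snippets above declare the inner dictionary as holding CommentRecord values and compare against an .Index property, yet they store the bare $Data string, and CommentRecord itself is never shown. Below is a minimal, self-contained sketch of the same step-by-step idea, assuming CommentRecord is a small class that carries both Index and Data. The class definition, the Update-CommentRecord helper and the sample rows are hypothetical, invented purely for illustration, and are not taken from your code:

```powershell
using namespace System.Collections.Generic

# Hypothetical record type: the original post never shows CommentRecord,
# so the Index and Data properties are assumptions.
class CommentRecord
{
    [int]    $Index
    [string] $Data

    CommentRecord([int] $index, [string] $data)
    {
        $this.Index = $index
        $this.Data  = $data
    }
}

# Hypothetical helper: keeps only the newest version (highest Index)
# of each [Reference][Sequence] entry.
function Update-CommentRecord
{
    param(
        [Dictionary[int,Dictionary[int,CommentRecord]]] $Records,
        [int]    $Reference,
        [int]    $Sequence,
        [int]    $Index,
        [string] $Data
    )

    # Step 1 - ensure the tier 1 dictionary is present
    if(!$Records.ContainsKey($Reference))
    {
        $Records.Add($Reference, [Dictionary[int,CommentRecord]]::new())
    }
    $bySequence = $Records[$Reference]

    # Step 2 - store the row if the sequence is new, or if this row is newer
    $isNew   = !$bySequence.ContainsKey($Sequence)
    $isNewer = (!$isNew) -and ($bySequence[$Sequence].Index -lt $Index)
    if($isNew -or $isNewer)
    {
        $bySequence[$Sequence] = [CommentRecord]::new($Index, $Data)
    }
    # otherwise the row is an older version and is discarded
}

$myRecords = [Dictionary[int,Dictionary[int,CommentRecord]]]::new()

# Sample rows, invented for illustration (stand-ins for ReadFields() output)
$rows = @(
    [pscustomobject]@{ Index = 1; Sequence = 1; Reference = 100; Data = 'first draft' }
    [pscustomobject]@{ Index = 2; Sequence = 2; Reference = 100; Data = 'second line' }
    [pscustomobject]@{ Index = 3; Sequence = 1; Reference = 100; Data = 'first line, revised' }
)

foreach($row in $rows)
{
    Update-CommentRecord -Records $myRecords -Reference $row.Reference `
        -Sequence $row.Sequence -Index $row.Index -Data $row.Data
}
```

With these sample rows, $myRecords[100][1].Data ends up as 'first line, revised', because the row with Index 3 supersedes the row with Index 1 for the same reference and sequence. Moving the bracket-heavy lookups into one named helper also leaves the loop body reading as a single sentence, which is roughly what get/set accessors would give you in C#.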

