Why does joining two related CSV files with a join clause produce a single element instead of a sequence of elements?

Problem description

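I want to merge two CSV files based on a matching value. The two files are related through an ID field, which is the third field in the first CSV file (names.csv) and the first field in the second CSV file (scores.csv).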

names.csv:

Adams,Terry,120  
Fakhouri,Fadi,116  
Feng,Hanying,117  
Garcia,Cesar,114  
Garcia,Debra,115  
Garcia,Hugo,118  
Mortensen,Sven,113  
O'Donnell,Claire,112  
Omelchenko,Svetlana,111  
Tucker,Lance,119  
Tucker,Michael,122  
Zabokritski,Eugene,121

scores.csv:

111, 97, 92, 81, 60  
112, 75, 84, 91, 39  
113, 88, 94, 65, 91  
114, 97, 89, 85, 82  
115, 35, 72, 91, 70  
116, 99, 86, 90, 94  
117, 93, 92, 80, 87  
118, 92, 90, 83, 78  
119, 68, 79, 88, 92  
120, 99, 82, 81, 79  
121, 96, 85, 91, 60  
122, 94, 92, 91, 91  

Code

            string[] names = File.ReadAllLines(@"/Users/username/Projects/ProjectA/names.csv");
            string[] marks = File.ReadAllLines(@"/Users/username/Projects/ProjectA/scores.csv");


            IEnumerable<Student> queryStudents =
                from name in names
                join mark in marks on name.Split(",")[2] equals mark.Split(',')[0]
                select new Student
                {
                    FirstName = name.Split(',')[0],
                    LastName = name.Split(',')[1],
                    ID = mark.Split(',')[0],
                    Scores = new List<int>
                  {int.Parse(mark.Split(',')[1]),
                  int.Parse(mark.Split(',')[2]),
                  int.Parse(mark.Split(',')[3]),
                  int.Parse(mark.Split(',')[4])
                  }
                };

            List<Student> students = queryStudents.ToList();

            foreach (Student student in students)
            {
                Console.WriteLine("The average score of {0} {1} is {2}.", student.FirstName, student.LastName, student.Scores.Average());
            }

I expect the following output from the query above, named queryStudents:

    The average score of Omelchenko Svetlana is 82.5.
    The average score of O'Donnell Claire is 72.25.
    The average score of Mortensen Sven is 84.5.
    The average score of Garcia Cesar is 88.25.
    The average score of Garcia Debra is 67.
    The average score of Fakhouri Fadi is 92.25.
    The average score of Feng Hanying is 88.
    The average score of Garcia Hugo is 85.75.
    The average score of Tucker Lance is 81.75.
    The average score of Adams Terry is 85.25.
    The average score of Zabokritski Eugene is 83.
    The average score of Tucker Michael is 92.

However, when I execute the query queryStudents, I get only one element:

The average score of Zabokritski Eugene is 83.

Tags: c#, linq, csv, join

Solution


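I think the problem is that names.csv has extra whitespace at the end of the ID field on every line except the last one. The join compares the keys as exact strings, so "120  " never equals "120". That is why only the record for Zabokritski,Eugene shows up in the output.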
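One way to confirm this kind of mismatch is to print the join key with visible delimiters and its length. The following is a minimal sketch, not part of the original code; the class name `JoinKeyCheck` and its methods are hypothetical helpers introduced only for illustration.

```csharp
using System;

// Hypothetical helper, introduced for illustration only.
public static class JoinKeyCheck
{
    // Wraps the key in brackets and reports its length,
    // so that trailing whitespace becomes visible.
    public static string Show(string key) => $"[{key}] length={key.Length}";

    // Extracts the third comma-separated field (the ID in names.csv),
    // optionally trimming surrounding whitespace.
    public static string IdField(string nameLine, bool trim)
    {
        string id = nameLine.Split(',')[2];
        return trim ? id.Trim() : id;
    }
}
```

For a line such as "Adams,Terry,120  " (two trailing spaces), `JoinKeyCheck.Show(JoinKeyCheck.IdField(line, false))` returns `[120  ] length=5`, while passing `true` returns `[120] length=3`, which matches the key in scores.csv.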

You can fix this by changing this line:

join mark in marks on name.Split(",")[2] equals mark.Split(',')[0]

so that it trims the whitespace before comparing, like this:

join mark in marks on name.Split(",")[2].Trim() equals mark.Split(',')[0].Trim()

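However, a lot is going on in this small piece of code. When debugging, I find it easier to break it up into separate methods, each with a specific purpose (for example, one method that creates the names, a separate one for the scores, and one that joins the results).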

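Update: as requested, here is how I would split up the updated code when debugging. It is much longer than your version (and I am not saying this is the best way to write the code), but for me it makes each part easier to understand and problems easier to trace: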

        static void Main(string[] args)
        {
            string[] names = File.ReadAllLines(@"/Users/username/Projects/ProjectA/names.csv");
            string[] marks = File.ReadAllLines(@"/Users/username/Projects/ProjectA/scores.csv");

            var students = CreateStudents(names);
            var scores = CreateScores(marks);
            var averageScores = CreateAverageScores(students, scores);

            DisplayResults(averageScores);
        }

        private static List<Student> CreateStudents(string[] names)
        {
            return names.Select(name => new Student
            {
                FirstName = name.Split(',')[0],
                LastName = name.Split(',')[1], 
                ID = name.Split(',')[2].Trim()
            }).ToList();
        }

        private static List<Score> CreateScores(string[] marks)
        {
            return marks.Select(mark => new Score
            {
                ID = mark.Split(',')[0].Trim(),
                Values = new List<int>
                {
                    int.Parse(mark.Split(',')[1]),
                    int.Parse(mark.Split(',')[2]),
                    int.Parse(mark.Split(',')[3]),
                    int.Parse(mark.Split(',')[4])
                }
            }).ToList();
        }

        private static IEnumerable<AverageScore> CreateAverageScores(List<Student> students, List<Score> scores)
        {
            var studentScores =
                from student in students
                join score in scores on student.ID equals score.ID
                select new AverageScore
                {
                    FirstName = student.FirstName,
                    LastName = student.LastName,
                    Average = score.Values.Average()
                };
            return studentScores;
        }

        private static void DisplayResults(IEnumerable<AverageScore> studentScores)
        {
            foreach (var studentScore in studentScores)
            {
                Console.WriteLine("The average score of {0} {1} is {2}.", studentScore.FirstName, studentScore.LastName,
                    studentScore.Average);
            }
        }
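The code above uses the types `Student`, `Score`, and `AverageScore` without showing their definitions. A minimal sketch of what they might look like follows; the property names are taken from how the code uses them, but the exact shape (public auto-properties, default initializers) is an assumption.

```csharp
using System.Collections.Generic;

// Assumed model types: only the property names and types are implied
// by the code above; everything else here is illustrative.
public class Student
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public string ID { get; set; } = "";
    // Used by the original single-query version in the question.
    public List<int> Scores { get; set; } = new List<int>();
}

public class Score
{
    public string ID { get; set; } = "";
    public List<int> Values { get; set; } = new List<int>();
}

public class AverageScore
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    // Enumerable.Average over a sequence of int returns a double.
    public double Average { get; set; }
}
```

Keeping `ID` as a string on both `Student` and `Score` means the join still compares text, so the `Trim()` calls in `CreateStudents` and `CreateScores` remain necessary.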
