perl - Compare columns from two files, print matching and non-matching lines in the same order as file1, and append YES/NO to each line
Problem description
File 1
3 14573 ab712 A T
8 12099 ab002 G A
9 12874 ab790 A C
3 19879 ab734 G T
File 2
3 14573 ab712 A T
9 12874 ab790 A C
Desired output
3 14573 ab712 A T YES
8 12099 ab002 G A NO
9 12874 ab790 A C YES
3 19879 ab734 G T NO
I tried nested Perl foreach loops over file1 and file2.
The output they produced was:
3 14573 ab712 A T YES
8 12099 ab002 G A NO
9 12874 ab790 A C NO
3 19879 ab734 G T NO
4 34565 ab992 C G NO
9 12874 ab790 A C YES
3 14573 ab712 A T NO
8 12099 ab002 G A NO
9 12874 ab790 A C NO
3 19879 ab734 G T NO
4 34565 ab992 C G NO
The script I tried
foreach $arr1 (@arr1) {
    chomp $arr1;
    ($chr1, $pos1, $id1, $ref1, $alt1) = split(/\t/, $arr1);
    foreach $arr2 (@arr2) {
        chomp $arr2;
        ($chr2, $pos2, $id2, $ref2, $alt2) = split(/\s/, $arr2);
        {
            if (($pos1 eq $pos2 ) && ($chr1 eq $chr2 )) {
                print "$chr1\t$pos1\t$ref1\t$alt1\tYES\n";
            } else {
                print "$chr1\t$pos1\t$ref1\t$alt1\tNO\n"
            }
        }
    }
}
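Solution
Your code is rather complicated, so I'm afraid I haven't had time to work through it and pin down what goes wrong.
I did, however, have time to write my own solution (with comments):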
#!/usr/bin/perl
# Always use these
use strict;
use warnings;
# Open file2...
open my $fh2, '<', 'file2' or die $!;
# ... and use its contents to construct a hash.
# The key of the hash is the line of data from the
# file (without the newline) and the value is the
# number 1.
# We can therefore use this hash to work out if a
# given line from file1 exists in file2.
my %file2 = map { chomp; $_ => 1 } <$fh2>;
# Open file1...
open my $fh1, '<', 'file1' or die $!;
# ... and process it a line at a time
while (<$fh1>) {
    # Remove the newline
    chomp;
    # Print the line
    print;
    # Find out if the line exists in file2
    # and print 'YES' or 'NO' as appropriate.
    print $file2{$_} ? ' YES' : ' NO';
    # Print a newline.
    print "\n";
}
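To see the hash lookup on its own, here is a minimal sketch that runs on the sample data held in arrays instead of files. The sub name `annotate_lines` is hypothetical (it is not part of the script above), and it tests the key with `exists` where the script tests the stored value for truth, which is a small variation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes two array refs of already-chomped lines
# and returns file1's lines, in their original order, each suffixed
# with ' YES' or ' NO'.
sub annotate_lines {
    my ($file1, $file2) = @_;

    # Build the lookup hash once: every file2 line becomes a key.
    my %seen = map { $_ => 1 } @$file2;

    # One hash lookup per file1 line, so file1's order is preserved.
    return map { $_ . (exists $seen{$_} ? ' YES' : ' NO') } @$file1;
}

# Sample data copied from the question.
my @file1 = (
    '3 14573 ab712 A T',
    '8 12099 ab002 G A',
    '9 12874 ab790 A C',
    '3 19879 ab734 G T',
);
my @file2 = (
    '3 14573 ab712 A T',
    '9 12874 ab790 A C',
);

print "$_\n" for annotate_lines(\@file1, \@file2);
```

On this sample data the sketch prints the four lines listed under the desired output in the question. Because the whole line is the hash key, a line counts as a match only if it is identical in both files, including its whitespace.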
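Update: here is a version that matches on only the first two fields of the input data (with the sample input this makes no difference, but your code suggests that this is what you want to match on).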
#!/usr/bin/perl
# Always use these
use strict;
use warnings;
# Open file2...
open my $fh2, '<', 'file2' or die $!;
# ... and use its contents to construct a hash.
# The key of the hash is the first two fields from
# the line of data from the file (joined with a space)
# and the value is the number 1.
# We can therefore use this hash to work out if the
# first two fields of a line from file1 appear in file2.
my %file2 = map { join(' ', (split)[0,1]) => 1 } <$fh2>;
# Open file1...
open my $fh1, '<', 'file1' or die $!;
# ... and process it a line at a time
while (<$fh1>) {
    # Remove the newline
    chomp;
    # Print the line
    print;
    # Find out if the first two fields of the line
    # appear in file2 and print 'YES' or 'NO' as appropriate.
    print $file2{join ' ', (split)[0,1]} ? ' YES' : ' NO';
    # Print a newline.
    print "\n";
}
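For comparison, the nested loops in the question print inside the inner loop, so they emit one row for every pair of file1 and file2 lines, not one row per file1 line. If the nested-loop structure is kept, one way to get a single row per file1 line is to record whether any file2 line matched and print only after the inner loop has finished. The sketch below assumes whitespace-separated fields and in-memory arrays; the sub name `annotate_by_chr_pos` is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compares the first two fields (chr, pos) of
# every file1 line against every file2 line and returns file1's lines
# suffixed with ' YES' or ' NO'.
sub annotate_by_chr_pos {
    my ($file1, $file2) = @_;
    my @out;

    for my $line1 (@$file1) {
        my ($chr1, $pos1) = split ' ', $line1;
        my $found = 0;

        for my $line2 (@$file2) {
            my ($chr2, $pos2) = split ' ', $line2;
            if ($chr1 eq $chr2 && $pos1 eq $pos2) {
                $found = 1;
                last;    # no need to check the remaining file2 lines
            }
        }

        # Decide YES/NO once per file1 line, after the inner loop.
        push @out, $line1 . ($found ? ' YES' : ' NO');
    }

    return @out;
}

my @file1 = ('3 14573 ab712 A T', '8 12099 ab002 G A',
             '9 12874 ab790 A C', '3 19879 ab734 G T');
my @file2 = ('3 14573 ab712 A T', '9 12874 ab790 A C');

print "$_\n" for annotate_by_chr_pos(\@file1, \@file2);
```

This gives the same result as the hash version but compares every pair of lines, so the hash lookup remains the better choice once the files grow large.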