How to multiply (duplicate) the data in my CSV file using Python

Problem description

For example, I have a CSV file:

CountryId,CountryCode,CountryDescription,CountryRegion,LastUpdatedDate,created_by,updated_by,created_on,update_on
countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020
countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020
countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020

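I want to multiply the data (basically replicate the data frame) up to a fixed target number of rows. How can I do this in Python, possibly for more than 30 rows of data?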

Tags: python, csv, duplicates

Solution


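There are two ways to do this:

  1. df.append([df] * 9, ignore_index=True) - appends nine more copies to df, giving ten copies in total. It returns a new data frame and does not modify df in place. DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this only works on older versions.
  2. pd.concat([df] * 10, ignore_index=True) - builds a new data frame from ten copies of df. This works on current pandas versions.
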
In [31]: df
Out[31]:
      CountryId CountryCode CountryDescription CountryRegion  ... created_by updated_by created_on  update_on
0  countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
1  countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
2  countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020

[3 rows x 9 columns]

In [36]: pd.concat([df] * 10, ignore_index=True)
Out[36]:
       CountryId CountryCode CountryDescription CountryRegion  ... created_by updated_by created_on  update_on
0   countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
1   countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
2   countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020
3   countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
4   countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
5   countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020
6   countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
7   countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
8   countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020
9   countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
10  countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
11  countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020
12  countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
13  countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
14  countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020
15  countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
16  countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
17  countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020
18  countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
19  countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
20  countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020
21  countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
22  countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
23  countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020
24  countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
25  countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
26  countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020
27  countryId123          ES              Spain            EU  ...        abc        gfg  7/17/2020  4/17/2020
28  countryId124          US      United States            US  ...        abc        gfg  7/17/2020  4/18/2020
29  countryId125          IT              Italy            EU  ...        abc        gfg  7/17/2020  4/19/2020
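
The question asks for a fixed target number of rows, which may not be an exact multiple of the original row count. The following is a minimal sketch of one way to handle that with pd.concat: repeat enough whole copies to cover the target, then trim the surplus. The helper name repeat_to_target and the target of 32 rows are assumptions made for illustration; they do not appear in the original answer.

```python
import io
import math

import pandas as pd

# Sample data copied from the question, read from memory instead of a file.
CSV_TEXT = """CountryId,CountryCode,CountryDescription,CountryRegion,LastUpdatedDate,created_by,updated_by,created_on,update_on
countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020
countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020
countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020
"""


def repeat_to_target(df, target_rows):
    """Return a new frame with df's rows repeated in order up to target_rows rows.

    Hypothetical helper for illustration; df itself is left unchanged.
    """
    if df.empty or target_rows < 0:
        raise ValueError("need a non-empty DataFrame and a non-negative target")
    # Whole copies needed to cover the target (at least one so concat has input).
    copies = max(1, math.ceil(target_rows / len(df)))
    repeated = pd.concat([df] * copies, ignore_index=True)
    # Trim the surplus rows left over from the last copy.
    return repeated.iloc[:target_rows]


df = pd.read_csv(io.StringIO(CSV_TEXT))
result = repeat_to_target(df, 32)
print(len(df), len(result))
```

Because the rows are repeated in their original order, row 30 of the result is the Spain row again and row 31 is the United States row.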

If you need to do this without pandas, remember that a CSV file is just a plain text file in which the column names and values are separated by commas.

In [43]: with open('a.csv') as csv_file:
    ...:     col_data = csv_file.readlines()
    ...:     column = col_data[0].strip()
    ...:     data = [i.strip() for i in col_data[1:]]
    ...:     new_data = data * 10
    ...:     print(new_data)
    ...:
['countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020', 'countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020', 'countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020', 'countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020', 'countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020', 'countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020', 'countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020', 'countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020', 'countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020', 'countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020', 'countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020', 'countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020', 'countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020', 'countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020', 'countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020', 'countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020', 'countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020', 'countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020', 'countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020', 'countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020', 'countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020', 'countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020', 'countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020', 'countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020', 'countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020', 
'countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020', 'countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020', 'countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020', 'countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020', 'countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020']

You can then save new_data to a file.
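
One way to write the repeated lines back out is sketched below. The header line is written once and the data lines are repeated. The function name repeat_csv_lines, the output file name a_repeated.csv, and the temporary directory are assumptions for the sake of a self-contained example; the input is assumed to be non-empty with a header on its first line.

```python
import tempfile
from pathlib import Path

# Sample data copied from the question.
SAMPLE = """CountryId,CountryCode,CountryDescription,CountryRegion,LastUpdatedDate,created_by,updated_by,created_on,update_on
countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020
countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020
countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020
"""


def repeat_csv_lines(src, dst, times):
    """Copy src to dst, keeping the header once and repeating the data lines."""
    with open(src, newline="") as f:
        lines = f.read().splitlines()
    header, data = lines[0], lines[1:]
    with open(dst, "w", newline="") as f:
        f.write("\n".join([header] + data * times) + "\n")


# Create a small input file in a temporary directory so the sketch runs anywhere.
workdir = Path(tempfile.mkdtemp())
src = workdir / "a.csv"
dst = workdir / "a_repeated.csv"
src.write_text(SAMPLE)

repeat_csv_lines(src, dst, 10)
out_lines = dst.read_text().splitlines()
print(len(out_lines))  # 31: one header line plus 3 * 10 data lines
```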

Update (each row split into a list of fields):

In [44]: with open('a.csv') as csv_file:
    ...:     col_data = csv_file.readlines()
    ...:     column = col_data[0].strip()
    ...:     data = [i.strip().split(",") for i in col_data[1:]]
    ...:     new_data = data * 10
    ...:     print(new_data)
    ...:
[['countryId123', 'ES', 'Spain', 'EU', ' 2018-03-29 07:19:00', 'abc', 'gfg', '7/17/2020', '4/17/2020'], ['countryId124', 'US', 'United States', 'US', ' 2018-03-29 07:19:01', 'abc', 'gfg', '7/17/2020', '4/18/2020'], ['countryId125', 'IT', 'Italy', 'EU', ' 2018-03-29 07:19:02', 'abc', 'gfg', '7/17/2020', '4/19/2020'], ['countryId123', 'ES', 'Spain', 'EU', ' 2018-03-29 07:19:00', 'abc', 'gfg', '7/17/2020', '4/17/2020'], ['countryId124', 'US', 'United States', 'US', ' 2018-03-29 07:19:01', 'abc', 'gfg', '7/17/2020', '4/18/2020'], ['countryId125', 'IT', 'Italy', 'EU', ' 2018-03-29 07:19:02', 'abc', 'gfg', '7/17/2020', '4/19/2020'], ['countryId123', 'ES', 'Spain', 'EU', ' 2018-03-29 07:19:00', 'abc', 'gfg', '7/17/2020', '4/17/2020'], ['countryId124', 'US', 'United States', 'US', ' 2018-03-29 07:19:01', 'abc', 'gfg', '7/17/2020', '4/18/2020'], ['countryId125', 'IT', 'Italy', 'EU', ' 2018-03-29 07:19:02', 'abc', 'gfg', '7/17/2020', '4/19/2020'], ['countryId123', 'ES', 'Spain', 'EU', ' 2018-03-29 07:19:00', 'abc', 'gfg', '7/17/2020', '4/17/2020'], ['countryId124', 'US', 'United States', 'US', ' 2018-03-29 07:19:01', 'abc', 'gfg', '7/17/2020', '4/18/2020'], ['countryId125', 'IT', 'Italy', 'EU', ' 2018-03-29 07:19:02', 'abc', 'gfg', '7/17/2020', '4/19/2020'], ['countryId123', 'ES', 'Spain', 'EU', ' 2018-03-29 07:19:00', 'abc', 'gfg', '7/17/2020', '4/17/2020'], ['countryId124', 'US', 'United States', 'US', ' 2018-03-29 07:19:01', 'abc', 'gfg', '7/17/2020', '4/18/2020'], ['countryId125', 'IT', 'Italy', 'EU', ' 2018-03-29 07:19:02', 'abc', 'gfg', '7/17/2020', '4/19/2020'], ['countryId123', 'ES', 'Spain', 'EU', ' 2018-03-29 07:19:00', 'abc', 'gfg', '7/17/2020', '4/17/2020'], ['countryId124', 'US', 'United States', 'US', ' 2018-03-29 07:19:01', 'abc', 'gfg', '7/17/2020', '4/18/2020'], ['countryId125', 'IT', 'Italy', 'EU', ' 2018-03-29 07:19:02', 'abc', 'gfg', '7/17/2020', '4/19/2020'], ['countryId123', 'ES', 'Spain', 'EU', ' 2018-03-29 07:19:00', 'abc', 'gfg', '7/17/2020', 
'4/17/2020'], ['countryId124', 'US', 'United States', 'US', ' 2018-03-29 07:19:01', 'abc', 'gfg', '7/17/2020', '4/18/2020'], ['countryId125', 'IT', 'Italy', 'EU', ' 2018-03-29 07:19:02', 'abc', 'gfg', '7/17/2020', '4/19/2020'], ['countryId123', 'ES', 'Spain', 'EU', ' 2018-03-29 07:19:00', 'abc', 'gfg', '7/17/2020', '4/17/2020'], ['countryId124', 'US', 'United States', 'US', ' 2018-03-29 07:19:01', 'abc', 'gfg', '7/17/2020', '4/18/2020'], ['countryId125', 'IT', 'Italy', 'EU', ' 2018-03-29 07:19:02', 'abc', 'gfg', '7/17/2020', '4/19/2020'], ['countryId123', 'ES', 'Spain', 'EU', ' 2018-03-29 07:19:00', 'abc', 'gfg', '7/17/2020', '4/17/2020'], ['countryId124', 'US', 'United States', 'US', ' 2018-03-29 07:19:01', 'abc', 'gfg', '7/17/2020', '4/18/2020'], ['countryId125', 'IT', 'Italy', 'EU', ' 2018-03-29 07:19:02', 'abc', 'gfg', '7/17/2020', '4/19/2020'], ['countryId123', 'ES', 'Spain', 'EU', ' 2018-03-29 07:19:00', 'abc', 'gfg', '7/17/2020', '4/17/2020'], ['countryId124', 'US', 'United States', 'US', ' 2018-03-29 07:19:01', 'abc', 'gfg', '7/17/2020', '4/18/2020'], ['countryId125', 'IT', 'Italy', 'EU', ' 2018-03-29 07:19:02', 'abc', 'gfg', '7/17/2020', '4/19/2020']]
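
Splitting on "," by hand breaks as soon as a field contains a quoted comma or a line break. As an alternative, here is a sketch that uses the standard library csv module instead, and that also stops at a fixed target number of rows by cycling through the data. The function name repeat_rows and the target of 32 rows are assumptions for illustration, and in-memory text buffers stand in for real files.

```python
import csv
import io
from itertools import cycle, islice

# Sample data copied from the question.
CSV_TEXT = """CountryId,CountryCode,CountryDescription,CountryRegion,LastUpdatedDate,created_by,updated_by,created_on,update_on
countryId123,ES,Spain,EU, 2018-03-29 07:19:00,abc,gfg,7/17/2020,4/17/2020
countryId124,US,United States,US, 2018-03-29 07:19:01,abc,gfg,7/17/2020,4/18/2020
countryId125,IT,Italy,EU, 2018-03-29 07:19:02,abc,gfg,7/17/2020,4/19/2020
"""


def repeat_rows(in_file, out_file, target_rows):
    """Write the header once, then cycle through the data rows until
    target_rows rows have been written."""
    reader = csv.reader(in_file)
    header = next(reader)  # first row holds the column names
    rows = list(reader)    # remaining rows are the data to repeat
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(header)
    # cycle() restarts from the first row whenever the data runs out;
    # islice() stops after exactly target_rows rows.
    writer.writerows(islice(cycle(rows), target_rows))


source = io.StringIO(CSV_TEXT)
target = io.StringIO()
repeat_rows(source, target, 32)

parsed = list(csv.reader(io.StringIO(target.getvalue())))
print(len(parsed))  # 33: one header row plus 32 data rows
```

With real files, open both with newline="" as the csv module documentation recommends.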
