How to create an if statement in a lambda function that returns the original tags when findall matches nothing

Problem description

Basically, this is my script:

# import the pandas module under the alias pd
import pandas as pd

# import Python's regex module
import re

# data holds the content of the .csv file, read with pandas' read_csv function
data = pd.read_csv('data1010.csv')

# keep only the tags that start with Up, an optional digit 0-9 and an underscore, plus any non-whitespace characters up to a following comma
data['tags'] = data['tags'].apply(lambda x: ",".join(re.findall(r"Up\d?_\S*(?=,)", x)))

# exclude the rows whose tags are blank
data = data[data['tags'] != ""]

# create a new column called total_tags holding the number of comma-separated elements
data["total_tags"] = data["tags"].apply(lambda x: len(x.split(',')))

# print the first 5 rows of the data
print(data.head())
# export everything to test.csv without the index column
data.to_csv("test.csv", index=False)

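What I am trying to do now: for every tag that does not match ",".join(re.findall("Up\d?_\S*(?=,)", x)), I want it returned in the same column.

So what it currently does is return the following: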

+--------------+-------+-----------------+--------------+--------------+-------------+
| product_id   |  sku  |    total_sold   |     tags     | total_images | total_tags  |
+--------------+-------+-----------------+--------------+--------------+-------------+
| grgeggre     | rgerg |             456 | Up1_, Up2    |            5 |           2 |
| grgrer       | agag  |             431 |              |            5 |             |
+--------------+-------+-----------------+--------------+--------------+-------------+

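(If there is no Up_ tag, the tags are removed and the cell is left blank.)

I want it to keep doing the same thing, but instead of leaving the cell blank it should return the tags that do not contain Up_ or Up*_. Below is an example, using made-up tags because the real ones are all different: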

+--------------+-------+-----------------+--------------+--------------+-------------+
| product_id   |  sku  |    total_sold   |     tags     | total_images | total_tags  |
+--------------+-------+-----------------+--------------+--------------+-------------+
| grgeggre     | rgerg |             456 | Up1_, Up2    |            5 |           2 |
| grgrer       | agag  |             431 | tag-c, tag-d |            5 |           2 |
+--------------+-------+-----------------+--------------+--------------+-------------+

Tags: python, pandas, csv, lambda

Solution


data['tags'] = data['tags'].apply(lambda x: ",".join(re.findall(r"Up\d?_\S*(?=,)", x)) if re.findall(r"Up\d?_\S*(?=,)", x) else x)

This works, for anyone interested.
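
The one-liner above runs the same findall twice per row. Below is a minimal sketch of the same idea as a named helper that searches only once; the helper name up_tags_or_original and the sample strings are hypothetical, chosen for illustration, and plain lists stand in for the 'tags' column so the snippet runs without the CSV file.

```python
import re

# Same pattern as in the question, written as a raw string and compiled once.
UP_TAG = re.compile(r"Up\d?_\S*(?=,)")

def up_tags_or_original(tags):
    """Return the matching Up tags joined by commas, or the input unchanged if none match."""
    matches = UP_TAG.findall(tags)
    return ",".join(matches) if matches else tags

# Hypothetical sample values standing in for the 'tags' column.
samples = ["Up1_red, Up2_blue, tag-c", "tag-c, tag-d"]
results = [up_tags_or_original(s) for s in samples]
print(results)  # ['Up1_red,Up2_blue', 'tag-c, tag-d']
```

With pandas, the helper could presumably be passed straight to apply, as in data['tags'] = data['tags'].apply(up_tags_or_original). One thing to keep in mind: the lookahead (?=,) requires a comma right after the tag, so an Up tag at the very end of the string is not matched, and such a row falls back to its original tags.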

