Converting multiple JSON objects into a single DataFrame/CSV


I am new to Python.

I would like to know how to run the same process as the code below for multiple URLs.

# Code 1, works fine

import pandas as pd
import requests

url = 'https://toyama.com.br/wp-json/wp/v2/assistencia?local=914&ramo=&_embed&per_page=100'
header = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
df = pd.read_json(url)
resp = requests.get(url, headers=header)
# Expand each record's 'acf' object into one column per key
pandas_data_frame1 = df['acf'].apply(pd.Series)
pandas_data_frame1.to_csv('teste2.CSV', encoding='utf-8-sig')

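# Code 2, does not work properly (multiple URLs; note that some URLs exist and others do not, and I need to handle that case)

import json

url1 = ['https://toyama.com.br/wp-json/wp/v2/assistencia?local=914&ramo=&_embed&per_page=100',
        # (the remaining URLs are cut off in the original post)
        ]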

header = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

for links in url1:
    df = pd.read_json(links)
    resp1 = requests.get(links, headers=header)
    data = json.loads(resp1.text)
    for d in data:
        pandas_data_frame1 = df['acf'].apply(pd.Series)
        pandas_data_frame1.to_csv ('teste2.CSV', encoding ='utf-8-sig') 

# Unfortunately, this saves only the content of the link 'https://toyama.com.br/wp-json/wp/v2/assistencia?local=1207&ramo=&_embed&per_page=100'

What I need is a single CSV with the JSON keys as columns, just like in Code 1.


Tags: json, dataframe, csv



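Every time you read a new link, you replace the CSV file, because to_csv is called inside the loop. That is why your code saves only the data from the last link. Build one combined DataFrame across all links and write the CSV once, after the loop.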
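The overwrite can be reproduced without any network access. The following is a minimal sketch: the two small frames and the `nome` column are made-up stand-ins for the per-URL data, and the file is written to a temporary directory rather than next to the script.

```python
import os
import tempfile

import pandas as pd

# Two small stand-in frames (hypothetical data, not taken from the real API).
frames = [
    pd.DataFrame({"nome": ["A", "B"]}),
    pd.DataFrame({"nome": ["C"]}),
]

path = os.path.join(tempfile.mkdtemp(), "teste2.csv")

# Writing inside the loop: to_csv opens the file in write mode by default,
# so every iteration replaces what the previous one wrote.
for frame in frames:
    frame.to_csv(path, index=False, encoding="utf-8-sig")
overwritten = pd.read_csv(path, encoding="utf-8-sig")

# Combining first and writing once after the loop keeps every row.
combined = pd.concat(frames, ignore_index=True)
combined.to_csv(path, index=False, encoding="utf-8-sig")
kept = pd.read_csv(path, encoding="utf-8-sig")

print(len(overwritten), len(kept))
```

The first read sees only the rows of the last frame, while the second sees the rows of both.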



import json

import pandas as pd
import requests

# Links for scraping web data
url1 = ['https://toyama.com.br/wp-json/wp/v2/assistencia?local=914&ramo=&_embed&per_page=100',
        # (the remaining URLs are cut off in the original post)
        ]

header = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

# Counter that tells the code when to join dataframes.
# For the first link we just create the dataframe. For the others we join them into a single dataframe.
cont = 0

# Scraping the data
for links in url1:
    cont += 1
    # Print which link the code is reading
    print('loop:' + str(cont))
    df = pd.read_json(links)
    resp1 = requests.get(links, headers=header)
    data = json.loads(resp1.text)
    # First dataframe processing
    if cont == 1:
        for d in data:
            complete_df = df['acf'].apply(pd.Series)
    # Other dataframes processing
    else:
        for d in data:
            others_df = df['acf'].apply(pd.Series)
            complete_df = pd.concat([complete_df, others_df])

# Remove duplicates from the dataframe. They appear because the inner loop
# repeats the same concat once for every record in data.
complete_df = complete_df.drop_duplicates()

# Save the CSV file once, after all links have been read.
complete_df.to_csv('teste2.CSV', encoding='utf-8-sig')
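The question also mentions that some of the URLs do not exist, which the loop above does not handle. One possible way to cover that case is sketched below; it is an illustration under assumptions, not the answerer's method. The helper names `fetch_records` and `collect_acf`, the shortened `HEADERS` value, and the 30-second timeout are all hypothetical. The sketch assumes each URL returns a JSON list of records that each carry an `acf` object, as in Code 1, and it skips any URL whose request fails or whose response is not such a list.

```python
import pandas as pd
import requests

# Shortened stand-in for the User-Agent header used in the question.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch_records(url):
    """Return the decoded JSON for url, or None if the request fails."""
    try:
        resp = requests.get(url, headers=HEADERS, timeout=30)
        resp.raise_for_status()  # raises on 4xx/5xx, e.g. a URL that does not exist
        return resp.json()
    except (requests.RequestException, ValueError):
        return None


def collect_acf(urls, fetch=fetch_records):
    """Build one DataFrame from the 'acf' object of every record at every URL."""
    frames = []
    for url in urls:
        records = fetch(url)
        if not isinstance(records, list):
            print("skipping:", url)
            continue
        # One row per record, one column per key of its 'acf' object.
        rows = [
            r["acf"]
            for r in records
            if isinstance(r, dict) and isinstance(r.get("acf"), dict)
        ]
        if rows:
            frames.append(pd.DataFrame(rows))
    if not frames:
        return pd.DataFrame()
    # Each URL contributes its rows exactly once, so no de-duplication is needed.
    return pd.concat(frames, ignore_index=True)


# Usage with the list from the question:
# complete_df = collect_acf(url1)
# complete_df.to_csv('teste2.CSV', encoding='utf-8-sig', index=False)
```

Passing the fetch function as a parameter is only a convenience so that the combining logic can be tried with canned data instead of live requests.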

