Append a DataFrame to an existing Google Sheet without appending the header row

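Problem description

I can currently append my DataFrame to an existing Google Sheet using the code below. The problem is that the script also appends the DataFrame's header row (the column names). I only want to append the regular data rows. Is there a way to specify this in the code below?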

### READ DF ###
df = pd.read_excel('dfsheet.xlsx', index_col=None)

### GOOGLE API CREDENTIALS ###  
CLIENT_SECRET_FILE = 'client_secret json file location'
API_SERVICE_NAME = 'sheets'
API_VERSION = 'v4'
SCOPES = ['https://www.googleapis.com/auth/drive']
gsheetID = 'My gsheet ID'

### MAKE GOOGLE API CONNECTION ###
service = Create_Service(CLIENT_SECRET_FILE, API_SERVICE_NAME, API_VERSION, SCOPES)

### THIS CALL APPENDS THE DF TO THE SPREADSHEET ###
response_date = service.spreadsheets().values().append(spreadsheetId=gsheetID,valueInputOption='RAW',range='Sheet1!A1',body=dict(majorDimension='ROWS', values=df.T.reset_index().T.values.tolist())).execute()

Side note: I tried to solve this by editing the pd.read_excel line to include header=None:

df = pd.read_excel('dfsheet.xlsx', index_col=None, header=None)

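But what this does is create a header row with integers as the column names. That header row is then appended together with the rest of the DataFrame.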
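
The behaviour described above can be reproduced with a small sketch. With header=None, pandas reads the file's header line as ordinary data and labels the columns 0, 1, 2, so the original expression emits the integer labels first and the old header line second. The example uses pd.read_csv on an in-memory string as a stand-in for pd.read_excel (an assumption, so that no Excel file is needed), and the sample data is made up.

```python
import io
import pandas as pd

# Made-up sample standing in for 'dfsheet.xlsx'
raw = "X,Y,Z\n1,4,7\n2,5,8\n"

# header=None: the first line is read as data, columns become 0, 1, 2
df = pd.read_csv(io.StringIO(raw), header=None)
column_labels = list(df.columns)

# The expression from the question now puts the integer labels first,
# followed by the file's own header line as a data row
rows = df.T.reset_index().T.values.tolist()

print(column_labels)
print(rows[:2])
```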

Tags: python, pandas, dataframe

Solution


I don't understand why you use .T to build values.


I get the correct data, without the header, if I use

values=df.values.tolist()

Based on: Update Google Sheets via API without the DataFrame header
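
To see the difference, here is a small comparison on a made-up three-column frame (the data is illustrative only). Calling reset_index() on the transposed frame moves the column names into a regular column, so transposing back puts them in the first row, whereas df.values holds only the cell values.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 5, 6], 'Z': [7, 8, 9]})

# Expression from the question: the first row is the column names
with_header = df.T.reset_index().T.values.tolist()

# Plain cell values: no header row
without_header = df.values.tolist()

print(with_header[0])
print(without_header)
```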


EDIT:

If you have to use .T, then you can simply skip the first row with [1:]

values=df.T.reset_index().T.values.tolist()[1:]

EDIT:

Full working code that I used for testing

import os
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.discovery import build

import pandas as pd

#-------------------------------------------------------------------------------------------

def Create_Service(client_secret_file, api_service_name, api_version, scopes):

    cred = None

    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.json') and os.stat('token.json').st_size > 0:
        cred = Credentials.from_authorized_user_file('token.json', scopes)

    # If there are no (valid) credentials available, let the user log in.
    if not cred or not cred.valid:
        if cred and cred.expired and cred.refresh_token:
            cred.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(client_secret_file, scopes)
            cred = flow.run_local_server()
        # Save the credentials for the next run
        with open('token.json', 'w') as token:
            token.write(cred.to_json())

    try:
        service = build(api_service_name, api_version, credentials=cred)
        print(api_service_name, 'service created successfully')
        return service
    except Exception as e:
        print(e)
        return None

#-------------------------------------------------------------------------------------------
### READ DF ###
#-------------------------------------------------------------------------------------------

#df = pd.read_excel('test/articles.xlsx', index_col=None)

#-------------------------------------------------------------------------------------------
### CREATE DF FOR TEST ###
#-------------------------------------------------------------------------------------------

data = {
    'X': [1, 2, 3],
    'Y': [4, 5, 6],
    'Z': [7, 8, 9]
} # columns
df = pd.DataFrame(data)

print(df)

#-------------------------------------------------------------------------------------------
### GOOGLE API CREDENTIALS ###
#-------------------------------------------------------------------------------------------

CLIENT_SECRET_FILE = '/home/furas/.creds/read-sheet-gdrive-desktop-client-1.json'
API_SERVICE_NAME = 'sheets'
API_VERSION = 'v4'
SCOPES = ['https://www.googleapis.com/auth/drive']

gsheetID = ''   # empty ID to create new spreadsheet
url = ''
gsheetID = '11S8eES4aLj3s35WMO35BcjRSLOQlymiaQdBSb_cqCaI'
url = 'https://docs.google.com/spreadsheets/d/11S8eES4aLj3s35WMO35BcjRSLOQlymiaQdBSb_cqCaI/edit'

#-------------------------------------------------------------------------------------------
### MAKE GOOGLE API CONNECTION ###
#-------------------------------------------------------------------------------------------

service = Create_Service(CLIENT_SECRET_FILE, API_SERVICE_NAME, API_VERSION, SCOPES)

#-------------------------------------------------------------------------------------------
### CREATE EMPTY SPREADSHEET ###
#-------------------------------------------------------------------------------------------

if not gsheetID:
    spreadsheet_body = {
        'properties': {
            'title': 'Stackoverflow example'
        },
    }

    request = service.spreadsheets().create(
        body=spreadsheet_body,
#        fields='spreadsheetId'
    )
    response = request.execute()

    gsheetID = response['spreadsheetId']
    url = response['spreadsheetUrl']

print('gsheetID:', gsheetID)
print('url:', url)

#-------------------------------------------------------------------------------------------
### APPEND NEW DATA TO SPREADSHEET (few times) ###
#-------------------------------------------------------------------------------------------

# using original method - it should add headers

for _ in range(2):
    response_date = service.spreadsheets().values().append(
        spreadsheetId=gsheetID,
        valueInputOption='RAW',
        #range='Sheet1!A1',
        range='A1',
        body=dict(majorDimension='ROWS', values=df.T.reset_index().T.values.tolist())
    ).execute()

# using `[1:]`

for _ in range(2):
    response_date = service.spreadsheets().values().append(
        spreadsheetId=gsheetID,
        valueInputOption='RAW',
        #range='Sheet1!A1',
        range='A1',
        body=dict(majorDimension='ROWS', values=df.T.reset_index().T.values.tolist()[1:])
    ).execute()

# using `df.values.tolist()`

for _ in range(2):
    response_date = service.spreadsheets().values().append(
        spreadsheetId=gsheetID,
        valueInputOption='RAW',
        #range='Sheet1!A1',
        range='A1',
        body=dict(majorDimension='ROWS', values=df.values.tolist())
    ).execute()
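
The three variants above differ only in how values is built, so one way to package that choice is a small helper. This is a sketch and not part of the original answer: build_append_body and its include_header flag are hypothetical names, and the helper only constructs the request body, leaving the append(...).execute() call exactly as in the code above.

```python
import pandas as pd

def build_append_body(df, include_header=False):
    """Return a body dict for spreadsheets().values().append().

    Hypothetical helper: include_header=True prepends the column names,
    which is typically only wanted when writing to an empty sheet.
    """
    rows = df.values.tolist()
    if include_header:
        rows = [df.columns.tolist()] + rows
    return dict(majorDimension='ROWS', values=rows)

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 5, 6], 'Z': [7, 8, 9]})
body = build_append_body(df)                              # data rows only
header_body = build_append_body(df, include_header=True)  # names + data
```

The result could then be passed as body=build_append_body(df) to the append call.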
