首页 > 解决方案 > How to efficiently submit multiple SQL query strings in Python?

问题描述

I have a python function that takes in a list of people (first name, last name, and date of birth stored in a dict), loops through this list and queries the database person by person, and returns a list containing the results found.

So for example if the user would want to query 3 people using the list [{'name': 'Alice'}, {'name': 'Bob'}, {'name': 'Charlie'}], the function would query the database 3 times, each time getting a list of potential matches that gets put into another list. The could return a list of 3 lists each containing matching results for Alice, Bob and Charlie. The function looks like the following.

query_string = "SELECT * FROM some_db"

def improved_batch_client_check(user_requests, db_connection):
    responses = []

    for one_request in user_requests:
        first_name = one_request.get('FirstName')
        last_name = one_request.get('LastName')
        dob = one_request.get('BirthDate')

        query_string_end = " WHERE db.NAME LIKE '%{}%' AND db.NAME LIKE '%{}%'".format(first_name, last_name)

        if dob is None:
            pass
        else:
            query_string_end += " AND db.DOB = '{}'".format(str(dob))
            

        df_candidates = pandas.read_sql(query_string + query_string_end, db_connection)
        responses.append(df_candidates.to_dict('records'))

    return jsonify(responses)

What I want to do is to improve efficiency by having to call the function pandas.read_sql() only once by somehow submitting a list of different queries that returns dataframes in a similar manner to what is used below. What is the easiest way to do it in Python?

标签: pythonsqlquery-optimization

解决方案


为了使其成为一个查询,我将创建一个或的查询,即 -

select * from table where (request1) or (request2) ...

request如果没有给出 dob 或-

db.NAME LIKE '%{}%' AND db.NAME LIKE '%{}%'".format(first_name, last_name)

如果给出 dob -

db.NAME LIKE '%{}%' AND db.NAME LIKE '%{}%' AND db.DOB = '{}'".format(first_name, last_name, str(dob))

把它们放在一起——

def improved_batch_client_check(user_requests, db_connection):
    requests = []
    for one_request in user_requests:
        first_name = one_request.get('FirstName')
        last_name = one_request.get('LastName')
        dob = one_request.get('BirthDate')

        curr_request = "db.NAME LIKE '%{}%' AND db.NAME LIKE '%{}%'".format(first_name, last_name)

        if dob is not None:
            curr_request += " AND db.DOB = '{}'".format(str(dob))
        requests.append(curr_request)    
     
    query_string = "SELECT * FROM some_db WHERE " + 'or '.join([f'({x})' for x in requests])

    df = pandas.read_sql(query_string, db_connection)
    return jsonify (df.to_dict('records'))

推荐阅读