首页 > 解决方案 > 如何使用数据表过滤 Python Dash 报告列的含义

问题描述

我正在使用 Python Dash 构建一个应用程序,我可以在其中显示破折号数据表,并将数据表(底层 pandas 数据框)的列均值报告为它下方的破折号数据表。到目前为止,我已经尝试过:


import dash
from dash.dependencies import Input, Output
import dash_table
import dash_core_components as dcc
import dash_html_components as html
import pandas as pd
import json
wide_data = pd.DataFrame(np.array([[1.24, 2.34, 3.234], [4.24, .45, .06], [7, 8, 9]]),
                   columns=['a', 'b', 'c'])


df = pd.DataFrame(wide_data)


df = pd.DataFrame(wide_data)
df_floats = df.select_dtypes(include=['float64'])
df_floats_means = df_floats.mean(numeric_only=True)
df_floats_means = pd.DataFrame(df_floats_means)
df_floats_means_T = df_floats_means.T


app = dash.Dash(__name__)

app.layout = html.Div([
    html.H1('Customer Complaints Dashboard'),
    html.Label('Read and Writing Queries Filtering'),
#     html.Label2('If "Read filter_query" is chosen, use operators (le, ge, <, <=, >, >=) + tag value to filter'),    
#     html.Label3('If "Write filter_query" is chosen, ___'),

    dcc.RadioItems(
        id='filter-query-read-write',
        options=[
            {'label': 'Read filter_query', 'value': 'read'},
        ],
        value='read'
    ),

    html.Br(),

    dcc.Input(id='filter-query-input', placeholder='Enter filter query'), #this is for write functionality

    html.Div(id='filter-query-output'), #this is associated with read screen

    html.Hr(),

    dash_table.DataTable(
        id='datatable-advanced-filtering',
        columns=[
            {'name': i, 'id': i, 'deletable': True} for i in df.columns
            # omit the id column
            if i != 'id'
        ],

        data=df.to_dict('records'),
        editable=True,
        page_action='native',
        page_size=10,
        filter_action="native",
    fixed_columns={'headers': True, 'data': 1},
    style_table={'minWidth': '100%'}
    ),

    html.Hr(),
    html.Div(id='datatable-query-structure', style={'whitespace': 'pre'}),
    html.Hr(),
    html.Label('Dataframe Means'),

    dash_table.DataTable(
        id='datatable-advanced-filtering2',
        columns=[
            {'name': i, 'id': i, 'deletable': True} for i in df_floats_means_T.columns
            # omit the id column
            if i != 'id'
        ],

        data=df_floats_means_T.to_dict('records'),
        editable=True,
        page_action='native',
        page_size=10,
     #   filter_action="native",
    fixed_columns={'headers': True, 'data': 1},
    style_table={'minWidth': '100%'}
    ),


    html.Hr(),
    html.Div(id='datatable-query-structure2', style={'whitespace': 'pre'}),
    html.Hr(),



])


@app.callback(
    [Output('filter-query-input', 'style'),
     Output('filter-query-output', 'style')],
    [Input('filter-query-read-write', 'value')]
)
def query_input_output(val):
    input_style = {'width': '100%'}
    output_style = {}
    if val == 'read':
        input_style.update(display='none')
        output_style.update(display='inline-block')
    else:
        input_style.update(display='inline-block')
        output_style.update(display='none')
    return input_style, output_style


@app.callback(
    Output('filter-query-output', 'children'),
    [Input('datatable-advanced-filtering', 'filter_query')]
)
def read_query(query):  
    if query is None:
        return "No filter query"
    return dcc.Markdown('`filter_query = "{}"`'.format(query))

@app.callback(
    Output('datatable-advanced-filtering2', 'data'),
    [Input('datatable-advanced-filtering', 'data')]
)

def update_means(data):
    dff = df
    df_floats = dff.select_dtypes(include=['float64'])
    df_floats_means = df_floats.mean(numeric_only=True)
    df_floats_means = pd.DataFrame(df_floats_means)
    df_floats_means_T = df_floats_means.T

    converted_means = df_floats_means_T.to_dict('records')
    return([converted_means])

if __name__ == '__main__':
    app.run_server(threaded=True)

但我知道这是不对的,因为我在应用程序布局的这一部分进行任何过滤之前计算方法id='datatable-advanced-filtering2'。大家能帮我回电吗?

我是 100% 的 dash 新手,我还没有看到很多关于如何正确执行此操作的指南。非常感谢任何帮助。谢谢你。

标签: pythoncallbackplotly-dash

解决方案


I believe you can use the data property of your main DataTable as the Input to the callback function, and have the Output be the data property of your means DataTable. The structure of that might look something like:

@app.callback(
    Output('datatable-advanced-filtering2', 'data'),
    [Input('datatable-advanced-filtering', 'data')]
)
def update_means(data):
    # Use data to construct dataframe
    # Find means using same method used at top of post
    # Convert means df using .to_dict('records')
    # Return converted dict

The docs have this to say about filtering (emphasis mine):

By default, these transformations are done clientside. Your Dash callbacks can respond to these modifications by listening to the data property as an Input.

You can read more here: https://dash.plotly.com/datatable/interactivity

EDIT

So after a bit of experimentation, I realized that when filter_action is set to 'native', the filtering of the DataTable doesn't actually update its data attribute. Because the attribute is not actually updated, any callback using the data attribute as an Input will not fire upon filtering. In order to get around this and use the data attribute as input to our callback, we need to set filter_action to 'custom', and then implement the filtering callback ourselves. Luckily, the Dash docs had an example of how to do this here: https://dash.plotly.com/datatable/filtering

Using the information in those docs, I was able to get an example to work. I'll leave it up to you to figure out the inner workings of the filtering method, but the following code runs locally for me and shows the means DataTable updating when the main DataTable is filtered. The only other modification I made was to the update_means function whereby I changed it from returning a list as you had in your example code, to returning a dict which is the correct type for the data attribute.

import dash
from dash.dependencies import Input, Output
import dash_table
import dash_core_components as dcc
import dash_html_components as html
import pandas as pd
import json
import numpy as np
from dash import Dash

# Instantiate df and calculate initial means
wide_data = pd.DataFrame(np.array([[1.24, 2.34, 3.234], [4.24, .45, .06], [7, 8, 9]]),
                         columns=['a', 'b', 'c'])
df = pd.DataFrame(wide_data)
df_floats = df.select_dtypes(include=['float64'])
df_floats_means = df_floats.mean(numeric_only=True)
df_floats_means = pd.DataFrame(df_floats_means)
df_floats_means_T = df_floats_means.T

# Operators used by custom filtering
operators = [['ge ', '>='],
             ['le ', '<='],
             ['lt ', '<'],
             ['gt ', '>'],
             ['ne ', '!='],
             ['eq ', '='],
             ['contains '],
             ['datestartswith ']]

# Initialize app
app = Dash(__name__)

# Init layout
app.layout = html.Div([
    html.H1('Customer Complaints Dashboard'),
    html.Label('Read and Writing Queries Filtering'),
    #     html.Label2('If "Read filter_query" is chosen, use operators (le, ge, <, <=, >, >=) + tag value to filter'),
    #     html.Label3('If "Write filter_query" is chosen, ___'),

    dcc.RadioItems(
        id='filter-query-read-write',
        options=[
            {'label': 'Read filter_query', 'value': 'read'},
        ],
        value='read'
    ),

    html.Br(),

    # this is for write functionality
    dcc.Input(id='filter-query-input', placeholder='Enter filter query'),

    html.Div(id='filter-query-output'),  # this is associated with read screen

    html.Hr(),

    dash_table.DataTable(
        id='datatable-advanced-filtering',
        columns=[
            {'name': i, 'id': i, 'deletable': True} for i in df.columns
            # omit the id column
            if i != 'id'
        ],

        data=df.to_dict('records'),
        editable=True,
        page_action='native',
        page_size=10,
        filter_action="custom",
        fixed_columns={'headers': True, 'data': 1},
        style_table={'minWidth': '100%'}
    ),

    html.Hr(),
    html.Div(id='datatable-query-structure', style={'whitespace': 'pre'}),
    html.Hr(),
    html.Label('Dataframe Means'),

    dash_table.DataTable(
        id='datatable-advanced-filtering2',
        columns=[
            {'name': i, 'id': i, 'deletable': True} for i in df_floats_means_T.columns
            # omit the id column
            if i != 'id'
        ],

        data=df_floats_means_T.to_dict('records'),
        editable=True,
        page_action='native',
        page_size=10,
        filter_action="native",
        fixed_columns={'headers': True, 'data': 1},
        style_table={'minWidth': '100%'}
    ),


    html.Hr(),
    html.Div(id='datatable-query-structure2', style={'whitespace': 'pre'}),
    html.Hr(),



])


@app.callback(
    [Output('filter-query-input', 'style'),
     Output('filter-query-output', 'style')],
    [Input('filter-query-read-write', 'value')]
)
def query_input_output(val):
    input_style = {'width': '100%'}
    output_style = {}
    if val == 'read':
        input_style.update(display='none')
        output_style.update(display='inline-block')
    else:
        input_style.update(display='inline-block')
        output_style.update(display='none')
    return input_style, output_style


@app.callback(
    Output('filter-query-output', 'children'),
    [Input('datatable-advanced-filtering', 'filter_query')]
)
def read_query(query):
    if query is None:
        return "No filter query"
    return dcc.Markdown('`filter_query = "{}"`'.format(query))


# Callback to re-calculate means after filtering is
@app.callback(
    Output('datatable-advanced-filtering2', 'data'),
    [Input('datatable-advanced-filtering', 'data')]
)
def update_means(data):
    # Calculate means from data currently stored in top datatable
    dff = pd.DataFrame.from_dict(data)
    df_floats = dff.select_dtypes(include=['float64'])
    df_floats_means = df_floats.mean(numeric_only=True)
    df_floats_means = pd.DataFrame(df_floats_means)
    df_floats_means_T = df_floats_means.T

    # Return means to means datatable
    #THIS NOW RETURNS DICT INSTEAD OF LIST
    converted_means = df_floats_means_T.to_dict('records')
    return converted_means


def split_filter_part(filter_part):
    '''Helper function for custom filtering'''
    for operator_type in operators:
        for operator in operator_type:
            if operator in filter_part:
                name_part, value_part = filter_part.split(operator, 1)
                name = name_part[name_part.find('{') + 1: name_part.rfind('}')]

                value_part = value_part.strip()
                v0 = value_part[0]
                if (v0 == value_part[-1] and v0 in ("'", '"', '`')):
                    value = value_part[1: -1].replace('\\' + v0, v0)
                else:
                    try:
                        value = float(value_part)
                    except ValueError:
                        value = value_part

                # word operators need spaces after them in the filter string,
                # but we don't want these later
                return name, operator_type[0].strip(), value

    return [None] * 3


@app.callback(
    Output('datatable-advanced-filtering', "data"),
    [Input('datatable-advanced-filtering', "filter_query")])
def update_table(filter):
    '''Callback that handles custom filtering of top datatable'''
    if filter is None:
        return df.to_dict('records')
    filtering_expressions = filter.split(' && ')
    dff = df
    for filter_part in filtering_expressions:
        col_name, operator, filter_value = split_filter_part(filter_part)

        if operator in ('eq', 'ne', 'lt', 'le', 'gt', 'ge'):
            # these operators match pandas series operator method names
            dff = dff.loc[getattr(dff[col_name], operator)(filter_value)]
        elif operator == 'contains':
            dff = dff.loc[dff[col_name].str.contains(filter_value)]
        elif operator == 'datestartswith':
            # this is a simplification of the front-end filtering logic,
            # only works with complete fields in standard format
            dff = dff.loc[dff[col_name].str.startswith(filter_value)]

    return dff.to_dict('records')


if __name__ == '__main__':
    app.run_server(threaded=True)


推荐阅读