data-warehouse - Dimensional Modeling: app session or activity measures
问题描述
I am trying to answer the below question given by the business (The business generates revenue from multiple apps through customer pay model) The business is interested in the below questions
- new users (trend with respect to previous months)
- daily active users
- Day 1 retention I came up with the below DM
Dimension: users, app, deviceid, useractions, plan, date
Fact: fact_activity(userid, appid,deviceid, actionid)
Actions could be: app installed, app launch, registered, completed purchase, postedcomments, playgame etc
The questions I have is
- should the fact table contain action_type instead of the actionid into the fact (to avoid join with useractions)
- Definition of day 1 retention: No of apps installed/ app launches next day how do to avoid multiple counting of single user using multiple devices
- Would it be advisable to have device details in the user dimension or separate.
- If I need to measure average session duration, should I use another fact at session level or tweak the activity fact?
解决方案
如果没有关于您的业务流程、数据定义等的更多信息,您的问题真的无法回答。实际上,您是在要求某人在他们回答您的问题之前为您设计一个维度模型——这显然不会发生。
但是,我可以为您提供一些非常通用的指针,它们可能会对您有所帮助:
方面
维度描述了一个实体,因此如果属性不能被描述为属于同一个实体,那么它们不应该在同一个维度中。在您的情况下,我假设 Device 和 User 不是同一事物,因此它们需要是单独的维度
事实
你需要定义你的度量,即你想要聚合的东西到底是什么(计数、总和、平均值等)以及它们是如何定义/计算的。
对于每个度量,您还需要定义其粒度,即唯一标识它的最小维度集是什么。一旦定义了粒度,如果多个度量具有相同的粒度,那么它们可以保存在同一个事实表中,如果没有,那么它们就不能