Using an attendance log file from a fingerprint device: how to show present and absent?

Question

I get this file from the device's fingerprint log:

Id User_id      PuchTime
--------------------------
1   152      2018-07-17 09:38:03
2   184      2018-07-17 16:56:43
3   152      2018-07-17 16:57:18
4   165      2018-07-17 16:57:43
5   70       2018-07-17 16:57:59
6   134      2018-07-17 16:58:28
7   276      2018-07-17 16:59:04
8   278      2018-07-17 16:59:05
9   271      2018-07-17 16:59:10
10  268      2018-07-17 16:59:13
11  284      2018-07-17 16:59:16
12  364      2018-07-17 16:59:35
13  19       2018-07-17 16:59:38
14  381      2018-07-17 17:01:12
15  73       2018-07-17 17:12:31
16  126      2018-07-17 17:12:36
17  382      2018-07-17 17:13:50
18  53       2018-07-18 06:34:13
19  284      2018-07-18 08:05:17

How can I write a Postgres query that extracts the data like this:

User_id  Check_Date   TimeIN    TimeOUT   Hours   status
--------------------------------------------------------
152      2018-07-17    09:38:03 16:56:43  7.8     present
152      2018-07-18                               Absent  

I use this query:

SELECT userid, name, CAST(PuchTime as DATE) Check_Date, 
to_char(PuchTime, 'day') days,
       MIN(CAST(PuchTime as Time)) TimeIN, 
       MAX(Cast(PuchTime as Time)) TimeOUT,
       CAST(MAX(PuchTime) - MIN(PuchTime) AS Time) As hour
FROM attendance_FHLHR
GROUP BY userid,name, CAST(PuchTime as DATE), to_char(PuchTime, 'day')
order by name DESC, check_date ASC, userid ASC

Output of my query (I still need the status and the hours calculation):

    User_id  Check_Date   TimeIN    TimeOUT   Hours
    -----------------------------------------------
    152      2018-07-17    09:38:03 16:56:43  7:18:40  
    152      2018-07-18    

Tags: sql, postgresql

Answer


DISCLAIMER: This answer covers both this question and the nearly duplicate follow-up question discussed further down.

demo: db<>fiddle


WITH dates AS(                                                   -- 1
    SELECT
        min(checktime)::date as min,
        max(checktime)::date as max
    FROM log
)

SELECT 
    user_id,
    check_date::date,
    -- 4: 
    CASE WHEN checktime::date = check_date THEN checktime::time  ELSE NULL END as time_in,
    CASE WHEN checktime::date = check_date THEN time_out::time ELSE NULL END as time_out,
    CASE WHEN checktime::date = check_date THEN (time_out - checktime)::time ELSE NULL END as hours
FROM (
    SELECT
        user_id,
        checktime,
        lead(checktime) OVER (ORDER BY checktime) as time_out,  -- 2
        generate_series(                                        -- 3
            (SELECT min FROM dates), 
            (SELECT max FROM dates), 
            interval '1 day'
        ) as check_date
    FROM log
)s
ORDER BY user_id, check_date

  1. Calculate the minimum and maximum dates of the log to get the bounds for the date generation.
  2. The lead window function returns the checktime of the row following the current one, so the next checktime counts as time_out.
  3. generate_series generates all dates between the given (calculated) bounds.
  4. CASE expressions: if the current checktime does not fall on the generated date, return NULL; otherwise return the current time_in / time_out / time difference.
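
For comparison, here is a minimal sketch of the same idea (generate every date between the log's bounds, then leave the times NULL where no punch exists) written as a join. It is not the query from the fiddle. It swaps lead() for a per-user min/max per day, as in the question's own query, and it assumes the table is log(user_id, checktime), as in the answer above. The CTE names bounds, days, users and punches are made up for illustration.

```sql
-- Sketch only; assumes log(user_id int, checktime timestamp).
WITH bounds AS (
    -- first and last calendar day that appear in the log
    SELECT min(checktime)::date AS first_day,
           max(checktime)::date AS last_day
    FROM log
),
days AS (
    -- one row per calendar day between the bounds
    SELECT d::date AS check_date
    FROM bounds, generate_series(first_day, last_day, interval '1 day') AS d
),
users AS (
    SELECT DISTINCT user_id FROM log
),
punches AS (
    -- first and last punch per user and day
    SELECT user_id,
           checktime::date AS check_date,
           min(checktime)  AS first_punch,
           max(checktime)  AS last_punch
    FROM log
    GROUP BY user_id, checktime::date
)
SELECT u.user_id,
       d.check_date,
       p.first_punch::time AS time_in,
       p.last_punch::time  AS time_out,
       -- seconds between first and last punch, converted to hours
       round((extract(epoch FROM p.last_punch - p.first_punch) / 3600)::numeric, 2) AS hours,
       CASE WHEN p.user_id IS NULL THEN 'absent' ELSE 'present' END AS status
FROM users u
CROSS JOIN days d
LEFT JOIN punches p
       ON p.user_id = u.user_id
      AND p.check_date = d.check_date
ORDER BY u.user_id, d.check_date;
```

With the sample data from the question, user 152 would come out as present on 2018-07-17 (09:38:03 to 16:57:18) and absent on 2018-07-18. A user with a single punch on a day gets identical time_in and time_out and 0 hours, which may or may not be what you want.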


Concerning the nearly duplicate, expanded question "Using Attendance Log and OFF days Table. How to show present, absents and OFF day?": I am adding the answer here because the duplicate should be closed and the question above should be expanded instead.

demo: see the second part of the fiddle above

WITH dates AS(
    SELECT
        min(checktime)::date as min,
        max(checktime)::date as max
    FROM log
)
SELECT DISTINCT ON (user_id, check_date, time_in)   -- 6
    user_id, 
    check_date, 
    to_char(check_date, 'Day') as day,              -- 2
    COALESCE(time_in,                               -- 4
         MAX(time_in) OVER (PARTITION BY user_id, check_date ORDER BY time_out NULLS LAST)
    ) as time_in, 
    time_out, 
    hours,
    CASE                                            -- 5
        WHEN checktime::date = check_date THEN 'present' 
        WHEN of.days IS NOT NULL THEN 'OFF DAY'
        ELSE 'absent'
    END as status
FROM (
    SELECT 
        user_id,
        check_date,
        checktime,
        CASE WHEN checktime::date = check_date THEN checktime::time  ELSE NULL END as time_in,
        CASE WHEN checktime::date = check_date THEN time_out::time ELSE NULL END as time_out,
        -- 1
        CASE WHEN checktime::date = check_date THEN extract(epoch FROM (time_out - checktime)) / 60 / 60 ELSE NULL END as hours
    FROM (
        SELECT
            user_id,
            checktime,
            lead(checktime) OVER (ORDER BY checktime) as time_out,
            generate_series(
                (SELECT min FROM dates), 
                (SELECT max FROM dates), 
                interval '1 day'
            ) as check_date
        FROM log
    ) s
) s
-- 3
LEFT JOIN off_days of ON (of.userid = s.user_id) AND (of.days = trim(to_char(check_date, 'day')))
ORDER BY user_id, check_date

Because this is an expansion of the previous query, I explain only the changes:

  1. Instead of giving the time difference as a time, a numeric value is needed. So extract(epoch...) returns the difference in seconds, which is converted into hours by / 60 / 60.
  2. Converting a date into a weekday name with the to_char function.
  3. Joining the off_days table on user_id and the weekday (again using the to_char function, this time in lowercase). to_char pads the day name with whitespace, so trim() removes it for the comparison.
  4. Tricky part (together with 6): because the join duplicates rows, the wrong ones have to be eliminated. A simple DISTINCT on user_id and weekday is not possible because, e.g., 152 has two entries on one day. But because 53 has two entries on different days (in my example, see fiddle), a valid row and an empty row are created for both dates. This line copies the time_in value into the empty row (for the next step see (6)).
  5. If there is an entry for the generated date (see first part above), the status is present. If not, check whether it is an "off day"; otherwise the status is absent.
  6. There are three cases: A: no duplicated rows; B: two rows per user_id and weekday because there are two entries; C: two rows (with empty times) because of the join. We want no duplicate rows, hence the DISTINCT ON. This also works for (C) because the time_in was copied into the NULL rows in (4).
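
To make points 1 and 3 concrete, here is a small standalone sketch that needs no table. It uses the first two timestamps from the question's sample log; the column aliases are only illustrative.

```sql
SELECT
    -- 'day' yields the lowercase day name, blank-padded to 9 characters
    to_char(DATE '2018-07-17', 'day')        AS padded_day,
    -- trim() strips the padding so the value can be compared with off_days.days
    trim(to_char(DATE '2018-07-17', 'day'))  AS trimmed_day,
    -- the FM prefix suppresses the padding directly, as an alternative to trim()
    to_char(DATE '2018-07-17', 'FMday')      AS fm_day,
    -- interval -> seconds -> hours
    extract(epoch FROM (TIMESTAMP '2018-07-17 16:56:43'
                      - TIMESTAMP '2018-07-17 09:38:03')) / 60 / 60 AS hours;
```

The interval here is 7:18:40, i.e. 26320 seconds, so hours comes out at roughly 7.31.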
