我是SQL和stackoverflow的新手,所以请怜悯我。我有 3 张桌子(如下所述)。我已经搜索了很多这个特定问题,但没有找到解决方案。我是一名研究生,研究孩子们多年来发展的行为和心理。我通过观察一个孩子的行为10分钟并记录他们每分钟在做什么来做到这一点。我为不同年龄的多个孩子做这件事,多年来我多次关注每个孩子。因此,数据是混合纵向的(多年来,相同的个体由多个数据点表示)。我还应该注意,我目前正在使用 Access。
10分钟的比赛称为"关注",每分钟的活动称为"扫描"。多年来,我已将这些数据输入到一个 Excel 工作表中,其中包含关注日期、扫描时间戳、个人 ID、活动和其他一些内容。此后,我将其分解为两个表:一个follow_id表,其中包括唯一的关注 ID 号、每个关注的日期以及关注的个人 ID 等信息。然后,我为每次扫描制作了第二个表,每次扫描都有它follow_id号相关联,这是它来自的追随者的唯一 ID 号。我希望这是有道理的,我尝试输入表格,但我无法正确设置格式。请原谅我的n00bness。
我需要做的是整理所有扫描,比如说,1岁以下的婴儿,然后是1-2岁的婴儿,等等。这很困难,因为我几乎所有的主题都在多个年龄箱中代表。换句话说,我可能在 9 个月大时跟踪个体 A,然后在 16 个月大时再次跟踪。所以我做了一个生日表,上面有每个人的身份证和他们的生日。
我已经尝试了一堆子查询和连接等等,但很明显我真的不知道我在做什么。如果有人能指出我正确的方向,我将不胜感激。例如,我应该从联接开始吗?或者这一切都可以只通过子查询工作吗?任何见解都会有所帮助,并将不胜感激。
tbl_biography
+------------------------------+
| individual_id | birth_date |
+------------------------------+
| AA | 2016-01-01 |
| BB | 2013-01-01 |
| CC | 2014-01-01 |
+------------------------------+
tbl_follow_id
+-------------------------------------------+
| follow_id | individual_id | follow_date |
+-------------------------------------------+
| 0001 | BB | 2013-12-12 |
| 0002 | BB | 2018-01-01 |
| 0003 | BB | 2015-01-01 |
| 0004 | CC | 2016-01-01 |
| 0005 | AA | 2017-01-01 |
+-------------------------------------------+
tbl_scan_id
-follow_id: 0001; 0001; 0001; 0002; 0002; 0002; 0003; 0003; 0003; 0004; 0004; 0004; 0005; 0005; 0005
-scan_id: 00001; 00002; 00003; 00004; 00005; 00006; 00007; 00008; 00009; 00010; 00011; 00012; 00013; 00014; 00015
-timestamp: (I don’t think this really matters here, but each scan would have hh:mm)
-scan: various behaviors such as REST, PLAY, EAT, etc.
我应该指出,您可以看到个体 AA 和 BB 在数据集中分别以 2 个单独的关注表示,并且每个跟随的年龄都不同。问题是,例如,如何过滤掉0-1岁,1-2岁等的所有扫描?目前,将所有个人聚集在一起是可以的。但是在未来,能够由某些人做到这一点将是惊人的,比如基于性别、母亲和我在传记表中也有的其他变量。提前感谢大家。
首先,您需要一个计算年龄的正确函数(见下文),因为 DateDiff 仅返回日历年的差异。
然后我会创建并保存一个查询,如下所示:
SELECT
Years([birth_date],[follow_date]) AS age,
tbl_biography.individual_id,
tbl_follow.follow_id
FROM
tbl_biography
INNER JOIN
tbl_follow
ON tbl_biography.individual_id = tbl_follow.individual_id
ORDER BY
Years([birth_date],[follow_date]),
tbl_biography.individual_id,
tbl_follow.follow_id;
这将产生以下输出:
age individual_id follow_id
0 BB 0001
1 AA 0005
2 BB 0003
2 CC 0004
5 BB 0002
现在,使用此查询并在新查询中tbl_scan,您可以在follow_id上联接这些查询。最后,从tbl_scan中添加字段并根据需要进行排序/过滤。
该函数:
Public Function Years( _
ByVal datDate1 As Date, _
ByVal datDate2 As Date, _
Optional ByVal booLinear As Boolean) _
As Integer
' Returns the difference in full years between datDate1 and datDate2.
'
' Calculates correctly for:
' negative differences
' leap years
' dates of 29. February
' date/time values with embedded time values
' negative date/time values (prior to 1899-12-29)
'
' Optionally returns negative counts rounded down to provide a
' linear sequence of year counts.
' For a given datDate1, if datDate2 is decreased step wise one year from
' returning a positive count to returning a negative count, one or two
' occurrences of count zero will be returned.
' If booLinear is False, the sequence will be:
' 3, 2, 1, 0, 0, -1, -2
' If booLinear is True, the sequence will be:
' 3, 2, 1, 0, -1, -2, -3
'
' If booLinear is False, reversing datDate1 and datDate2 will return
' results of same absolute Value, only the sign will change.
' This behaviour mimics that of Fix().
' If booLinear is True, reversing datDate1 and datDate2 will return
' results where the negative count is offset by -1.
' This behaviour mimics that of Int().
' DateAdd() is used for check for month end of February as it correctly
' returns Feb. 28. when adding a count of years to dates of Feb. 29.
' when the resulting year is a common year.
'
' 2000-11-03. Cactus Data ApS, CPH.
' 2000-12-16. Leap year correction modified to be symmetrical.
' Calculation of intDaysDiff simplified.
' Renamed from YearsDiff() to Years().
' 2000-12-18. Introduced cbytMonthDaysMax.
' 2007-06-22. Version 2. Complete rewrite.
' Check for month end of February performed with DateAdd()
' after idea of Markus G. Fischer.
Dim intDiff As Integer
Dim intSign As Integer
Dim intYears As Integer
' Find difference in calendar years.
intYears = DateDiff("yyyy", datDate1, datDate2)
' For positive resp. negative intervals, check if the second date
' falls before, on, or after the crossing date for a full 12 months period
' while at the same time correcting for February 29. of leap years.
If DateDiff("d", datDate1, datDate2) > 0 Then
intSign = Sgn(DateDiff("d", DateAdd("yyyy", intYears, datDate1), datDate2))
intDiff = Abs(intSign < 0)
Else
intSign = Sgn(DateDiff("d", DateAdd("yyyy", -intYears, datDate2), datDate1))
If intSign <> 0 Then
' Offset negative count of years to continuous sequence if requested.
intDiff = Abs(booLinear)
End If
intDiff = intDiff - Abs(intSign < 0)
End If
' Return count of years as count of full 12 months periods.
Years = intYears - intDiff
End Function
要用月份计算,请使用类似的函数来正确计算:
Public Function Months( _
ByVal datDate1 As Date, _
ByVal datDate2 As Date, _
Optional ByVal booLinear As Boolean) _
As Integer
' Returns the difference in full months between datDate1 and datDate2.
'
' Calculates correctly for:
' negative differences
' leap years
' dates of 29. February
' date/time values with embedded time values
' negative date/time values (prior to 1899-12-29)
'
' Optionally returns negative counts rounded down to provide a
' linear sequence of month counts.
' For a given datDate1, if datDate2 is decreased stepwise one month from
' returning a positive count to returning a negative count, one or two
' occurrences of count zero will be returned.
' If booLinear is False, the sequence will be:
' 3, 2, 1, 0, 0, -1, -2
' If booLinear is True, the sequence will be:
' 3, 2, 1, 0, -1, -2, -3
'
' If booLinear is False, reversing datDate1 and datDate2 will return
' results of same absolute Value, only the sign will change.
' This behaviour mimics that of Fix().
' If booLinear is True, reversing datDate1 and datDate2 will return
' results where the negative count is offset by -1.
' This behaviour mimics that of Int().
' DateAdd() is used for check for month end of February as it correctly
' returns Feb. 28. when adding a count of months to dates of Feb. 29.
' when the resulting year is a common year.
'
' 2010-03-30. Cactus Data ApS, CPH.
Dim intDiff As Integer
Dim intSign As Integer
Dim intMonths As Integer
' Find difference in calendar months.
intMonths = DateDiff("m", datDate1, datDate2)
' For positive resp. negative intervals, check if the second date
' falls before, on, or after the crossing date for a 1 month period
' while at the same time correcting for February 29. of leap years.
If DateDiff("d", datDate1, datDate2) > 0 Then
intSign = Sgn(DateDiff("d", DateAdd("m", intMonths, datDate1), datDate2))
intDiff = Abs(intSign < 0)
Else
intSign = Sgn(DateDiff("d", DateAdd("m", -intMonths, datDate2), datDate1))
If intSign <> 0 Then
' Offset negative count of months to continuous sequence if requested.
intDiff = Abs(booLinear)
End If
intDiff = intDiff - Abs(intSign < 0)
End If
' Return count of months as count of full 1 month periods.
Months = intMonths - intDiff
End Function
您可以使用DateDiff
函数计算"跟随年龄",以获得birth_date
和follow_date
之间的差异:
DateDiff("yyyy", tbl_biography.birth_date, tbl_follow_id.follow_date)
可以将该表达式放入查询设计器中的字段中。
您需要联接所有三个表(因为您希望从tbl_scan_id
中获得结果,并且您需要tbl_follow_id
follow_date
和tbl_biography
birth_date
)。
如果表定义了适当的关系,则查询设计器将联接它们,但如果没有,则需要手动联接它们(将individual_id
拖在一起,follow_id
一起联接)。
示例 SQL 如下:
SELECT
tbl_scan_id.*,
DateDiff("yyyy", tbl_biography.birth_date, tbl_follow_id.follow_date) AS AgeAtFollow,
tbl_follow_id.follow_date,
tbl_biography.individual_id,
tbl_biography.birth_date
FROM
(tbl_follow_id
INNER JOIN tbl_biography
ON tbl_follow_id.individual_id = tbl_biography.individual_id)
INNER JOIN tbl_scan_id
ON tbl_follow_id.follow_id = tbl_scan_id.follow_id;
然后,只需根据需要过滤AgeAtFollow字段(或按其分组等)。您还可以重新调整该查询的用途,并添加其他传记字段以进行过滤(性别、母亲、人口统计等)。