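I am looking for help converting data stored in rows into data stored in columns.
Background... I am working with a table that contains hospitalisation data. Let's call this table "Inpatients".
The data is currently formatted as a table with 3 columns and n rows. The 3 columns are Patient_ID, Event_ID and Diagnosis_Code.
For a single patient (Patient_ID), each hospital stay (Event_ID) is represented by one or more rows in the table, with one row for each diagnosis recorded during that stay.
So any given hospital stay may be captured by a single row (one recorded diagnosis) or by multiple rows (several associated diagnoses).
An example of the current "Inpatients" table is given below...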
-------------------------------------------
Patient_ID | Event_ID | Diagnosis_Code
-------------------------------------------
Pers001 | HospStay001 | C139
Pers001 | HospStay001 | I245
Pers001 | HospStay001 | D456
Pers001 | HospStay002 | C139
Pers001 | HospStay002 | J123
Pers555 | HospStay001 | D312
Pers999 | HospStay001 | C120
Pers999 | HospStay001 | E101
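Here is what I actually want to do: I would like to transform the data so that there is only one row per hospital stay per patient, so that the table above ends up in the following format: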
----------------------------------------------------------------------------------------------------
Patient_ID | Event_ID | Diagnosis_Code_1 | Diagnosis_Code_2 | Diagnosis_Code_3 | Diagnosis_Code_n
----------------------------------------------------------------------------------------------------
Pers001 | HospStay001 | C139 | I245 | D456 |
Pers001 | HospStay002 | C139 | J123 | |
Pers555 | HospStay001 | D312 | | |
Pers999 | HospStay001 | C120 | E101 | |
I suspect the solution requires some dynamic SQL... which I'm afraid is not one of my strengths.
Thanks!
-- Sample data: Dig_Number numbers each diagnosis within a hospital stay
CREATE TABLE #source (
    Patient_ID     VARCHAR(100),
    Event_ID       VARCHAR(100),
    Diagnosis_Code VARCHAR(100),
    Dig_Number     INT
);

INSERT INTO #source (Patient_ID, Event_ID, Diagnosis_Code, Dig_Number) VALUES
('Pers001','HospStay001','I245',2),
('Pers001','HospStay001','D456',3),
('Pers001','HospStay002','C139',1),
('Pers001','HospStay002','J123',2),
('Pers555','HospStay001','D312',1),
('Pers999','HospStay001','C120',1),
('Pers999','HospStay001','E101',2),
('Pers001','HospStay001','C139',1);

--DROP TABLE tempdb..#source

DECLARE @cols  AS NVARCHAR(MAX),
        @query AS NVARCHAR(MAX);

-- Build the list of pivot columns from the distinct diagnosis numbers, e.g. [1],[2],[3]
SELECT @cols = STUFF
(
    (
        SELECT ',' + QUOTENAME(CONVERT(VARCHAR(10), Dig_Number))
        FROM #source
        GROUP BY Dig_Number
        ORDER BY Dig_Number
        FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)'),
    1, 1, ''
);

-- Pivot so that each diagnosis number becomes its own column
SET @query = 'SELECT Patient_ID, Event_ID, ' + @cols + '
FROM
(
    SELECT Patient_ID, Event_ID, Diagnosis_Code, Dig_Number
    FROM #source
) x
PIVOT
(
    MAX(Diagnosis_Code)
    FOR Dig_Number IN (' + @cols + ')
) p ';

EXECUTE(@query);
This works if you add one more column, the diagnosis number (Dig_Number).
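The original table has no Dig_Number column, so it would have to be derived first. One possible way to do that in SQL Server is ROW_NUMBER(), sketched below. The table name Inpatients is taken from the question, and #numbered is a hypothetical temp table name. The rows carry no recorded order, so ordering by Diagnosis_Code is an arbitrary choice and the numbering will not necessarily match the order in which the diagnoses were entered.

```sql
-- Sketch: derive a per-stay diagnosis counter from the original 3-column table.
-- Assumes the source table is named Inpatients; #numbered is a hypothetical temp table.
SELECT Patient_ID,
       Event_ID,
       Diagnosis_Code,
       ROW_NUMBER() OVER (
           PARTITION BY Patient_ID, Event_ID   -- restart at 1 for each hospital stay
           ORDER BY Diagnosis_Code             -- arbitrary but deterministic order
       ) AS Dig_Number
INTO #numbered
FROM Inpatients;
```

#numbered could then stand in for #source in the dynamic pivot above.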
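Thanks to Rajat Jaiswal, LeasMaps and Tim Biegeleisen for their contributions. Much appreciated.
The key was the suggestion to add an extra column to the original table to serve as the column headings in the transformed table. This turned out to be relatively easy to do (I did it in MS Excel).
So my original table was edited to look like this...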
--------------------------------------------------------------
Patient_ID | Event_ID | Diagnosis_Code | DiagCode_Counter |
--------------------------------------------------------------
Pers001 | HospStay001 | C139 | Diagnosis_Code_1 |
Pers001 | HospStay001 | I245 | Diagnosis_Code_2 |
Pers001 | HospStay001 | D456 | Diagnosis_Code_3 |
Pers001 | HospStay002 | C139 | Diagnosis_Code_1 |
Pers001 | HospStay002 | J123 | Diagnosis_Code_2 |
Pers555 | HospStay001 | D312 | Diagnosis_Code_1 |
Pers999 | HospStay001 | C120 | Diagnosis_Code_1 |
Pers999 | HospStay001 | E101 | Diagnosis_Code_2 |
--------------------------------------------------------------
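In the newly added "DiagCode_Counter" field, the numeric suffix increments by 1 each time a new Diagnosis_Code value is recorded against a unique "Event_ID".
I was then able to create a crosstab query in MS Access, using the "Patient_ID" and "Event_ID" fields as row headings, the "DiagCode_Counter" field as the column heading, and the "Diagnosis_Code" entries as the values.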
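In Access SQL view, a crosstab query along those lines might look roughly like the following. This is only a sketch: the table name Inpatients is an assumption, and the query the Access wizard generates may differ in detail.

```sql
TRANSFORM First(Diagnosis_Code)
SELECT Patient_ID, Event_ID
FROM Inpatients
GROUP BY Patient_ID, Event_ID
PIVOT DiagCode_Counter;
```

Because the column headings are text, a stay with ten or more diagnoses would probably list Diagnosis_Code_10 before Diagnosis_Code_2. An explicit IN list on the PIVOT clause is one way to fix the column order if that matters.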
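Rajat is correct: you need some kind of column to use for creating diagnosis_column_1, diagnosis_column_2, ...
To do this in MS Access, I would: 1. Create a dummy column to hold the diagnosis counter. 2. Populate it using VBA (faster for large databases), as follows: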
Sub Update_Diagnosis_Code_ID()
    Dim db As DAO.Database
    Dim rs As DAO.Recordset
    Dim pstrSQL As String
    Dim dummyId As Integer
    Dim patientID As String
    Dim eventID As String
    Dim lastpatientID As String
    Dim lasteventID As String

    pstrSQL = "SELECT Inpat.Dummy_id, Inpat.Patient_id, Inpat.Event_ID, Inpat.Diagnosis_Code FROM Inpat ORDER BY Inpat.Patient_id, Inpat.Event_ID;"
    Set db = CurrentDb
    Set rs = db.OpenRecordset(pstrSQL)
    dummyId = 0
    With rs
        If Not .EOF Then
            'first record
            .MoveFirst
            patientID = rs.Fields(1)
            eventID = rs.Fields(2)
            .Edit
            rs.Fields(0) = dummyId + 1
            .Update
            .MoveNext
            Do While Not .EOF
                'store the values from the last record
                lastpatientID = patientID
                lasteventID = eventID
                'get the new values
                patientID = rs.Fields(1)
                eventID = rs.Fields(2)
                'new patient or new hospital stay
                If patientID <> lastpatientID Or eventID <> lasteventID Then
                    dummyId = 0 'reset so that the value written below is 1
                Else
                    dummyId = dummyId + 1
                End If
                .Edit
                rs.Fields(0) = dummyId + 1
                .Update
                .MoveNext
            Loop
        End If
    End With
    rs.Close
    Set rs = Nothing
    Set db = Nothing
    MsgBox "Finished", vbExclamation
End Sub
Then, to display the data using a crosstab query:
TRANSFORM First(Inpat.[Diagnosis_Code]) AS FirstOfDiagnosis_Code
SELECT Inpat.[Patient_id], Inpat.[Event_ID], Count(Inpat.[Diagnosis_Code])
AS [Total Of Diagnosis_Code]
FROM Inpat
GROUP BY Inpat.[Patient_id], Inpat.[Event_ID]
PIVOT Inpat.[Dummy_id];