SQL - Convert/transpose rows with coded text values into columns



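I'm looking for help converting data that is stored in rows into data that is stored in columns.

Background: I'm working with a table of hospital inpatient data. Let's call this table "Inpatient".

The data is currently formatted as a table with 3 columns and n rows. The 3 columns hold the following data: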

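  • "Patient_ID" = unique patient/person identifier. Think of this as the patient's name;
  • "Event_ID" = unique admission event identifier. It identifies a distinct episode of care in hospital;
  • "Diagnosis_Code" = ICD-10 code recording the reason the patient was hospitalised.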

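    For a single patient (Patient_ID), each hospital stay (Event_ID) is represented by one or more rows in the table, with one row for each diagnosis recorded during that stay.

    So any given hospital stay may be captured by a single row in the table (one recorded diagnosis) or by multiple rows (multiple associated diagnoses).

    An example of the current "Inpatient" table is given below...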

    -------------------------------------------
    Patient_ID |  Event_ID   |  Diagnosis_Code
    -------------------------------------------
    Pers001    | HospStay001 |     C139
    Pers001    | HospStay001 |     I245
    Pers001    | HospStay001 |     D456
    Pers001    | HospStay002 |     C139
    Pers001    | HospStay002 |     J123
    Pers555    | HospStay001 |     D312
    Pers999    | HospStay001 |     C120
    Pers999    | HospStay001 |     E101
    

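    Here is what I'd really like to do: I want to transform the data so that there is only one row per hospital stay per patient, so that the table above is formatted as follows: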

    ----------------------------------------------------------------------------------------------------
    Patient_ID |  Event_ID   | Diagnosis_Code_1 | Diagnosis_Code_2 | Diagnosis_Code_3 | Diagnosis_Code_n
    ----------------------------------------------------------------------------------------------------
    Pers001    | HospStay001 |       C139       |       I245       |       D456       |
    Pers001    | HospStay002 |       C139       |       J123       |                  |
    Pers555    | HospStay001 |       D312       |                  |                  |
    Pers999    | HospStay001 |       C120       |       E101       |                  |
    

    I suspect the solution requires some dynamic SQL... not one of my strengths, I'm afraid.

    Thanks!

  • CREATE TABLE #source (
        Patient_ID     VARCHAR(100),
        Event_ID       VARCHAR(100),
        Diagnosis_Code VARCHAR(100),
        Dig_Number     INT
    );

    INSERT INTO #source (Patient_ID, Event_ID, Diagnosis_Code, Dig_Number) VALUES
    ('Pers001','HospStay001','C139',1),
    ('Pers001','HospStay001','I245',2),
    ('Pers001','HospStay001','D456',3),
    ('Pers001','HospStay002','C139',1),
    ('Pers001','HospStay002','J123',2),
    ('Pers555','HospStay001','D312',1),
    ('Pers999','HospStay001','C120',1),
    ('Pers999','HospStay001','E101',2);

    -- DROP TABLE #source;

    -- Build the pivot column list from the distinct diagnosis numbers, e.g. [1],[2],[3]
    DECLARE @cols  AS NVARCHAR(MAX),
            @query AS NVARCHAR(MAX);

    SELECT @cols = STUFF(
        (
            SELECT ',' + QUOTENAME(CONVERT(VARCHAR(10), Dig_Number))
            FROM #source
            GROUP BY Dig_Number
            ORDER BY Dig_Number
            FOR XML PATH(''), TYPE
        ).value('.', 'NVARCHAR(MAX)'),
        1, 1, ''
    );

    -- One output row per (Patient_ID, Event_ID); the pivoted columns are named 1, 2, 3, ...
    SET @query = 'SELECT Patient_ID, Event_ID, ' + @cols + '
    FROM
    (
        SELECT Patient_ID, Event_ID, Diagnosis_Code, Dig_Number
        FROM #source
    ) x
    PIVOT
    (
        MAX(Diagnosis_Code)
        FOR Dig_Number IN (' + @cols + ')
    ) p;';

    EXECUTE(@query);
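
When the maximum number of diagnoses per stay is known in advance, dynamic SQL may not be needed at all. The following is a minimal sketch using conditional aggregation. It assumes at most three diagnoses per stay (an assumption that happens to fit the sample data) and a diagnosis-number column like Dig_Number above. The inline `src` data set is hypothetical and only there to make the example self-contained.

```sql
-- Sketch: fixed-width pivot via conditional aggregation (T-SQL).
-- Assumption: no stay has more than three diagnoses.
WITH src AS (
    SELECT *
    FROM (VALUES
        ('Pers001','HospStay001','C139',1),
        ('Pers001','HospStay001','I245',2),
        ('Pers001','HospStay001','D456',3),
        ('Pers001','HospStay002','C139',1),
        ('Pers001','HospStay002','J123',2),
        ('Pers555','HospStay001','D312',1),
        ('Pers999','HospStay001','C120',1),
        ('Pers999','HospStay001','E101',2)
    ) AS v (Patient_ID, Event_ID, Diagnosis_Code, Dig_Number)
)
SELECT
    Patient_ID,
    Event_ID,
    -- Each CASE keeps only the code with the matching number; MAX picks it up per group
    MAX(CASE WHEN Dig_Number = 1 THEN Diagnosis_Code END) AS Diagnosis_Code_1,
    MAX(CASE WHEN Dig_Number = 2 THEN Diagnosis_Code END) AS Diagnosis_Code_2,
    MAX(CASE WHEN Dig_Number = 3 THEN Diagnosis_Code END) AS Diagnosis_Code_3
FROM src
GROUP BY Patient_ID, Event_ID
ORDER BY Patient_ID, Event_ID;
```

Stays with fewer than three diagnoses get NULL in the unused columns. The trade-off is that a fourth diagnosis would be silently dropped unless another column is added by hand, which is the case the dynamic version handles.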
    

    If you add one more column, namely the diagnosis number, this will work.
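
If the table does not have a diagnosis-number column yet, one way to derive it in SQL Server is ROW_NUMBER(). This is a sketch under assumptions rather than part of the answer above: the `inpatient` data set is a hypothetical inline copy of the question's sample rows, and ordering by Diagnosis_Code is an assumption, because the original table records no order for the diagnoses within a stay.

```sql
-- Sketch: derive Dig_Number per (Patient_ID, Event_ID) with a window function (T-SQL).
WITH inpatient AS (
    SELECT *
    FROM (VALUES
        ('Pers001','HospStay001','C139'),
        ('Pers001','HospStay001','I245'),
        ('Pers001','HospStay001','D456'),
        ('Pers001','HospStay002','C139'),
        ('Pers001','HospStay002','J123'),
        ('Pers555','HospStay001','D312'),
        ('Pers999','HospStay001','C120'),
        ('Pers999','HospStay001','E101')
    ) AS v (Patient_ID, Event_ID, Diagnosis_Code)
)
SELECT
    Patient_ID,
    Event_ID,
    Diagnosis_Code,
    ROW_NUMBER() OVER (
        PARTITION BY Patient_ID, Event_ID   -- numbering restarts for every hospital stay
        ORDER BY Diagnosis_Code             -- assumption: alphabetical order of the codes
    ) AS Dig_Number
FROM inpatient;
```

With this ordering the numbers follow the alphabetical order of the codes, so for Pers001 / HospStay001 the sequence is C139, D456, I245 rather than the order in which the rows are listed in the question. If the real table has a sequence or timestamp column, ordering by that instead would preserve the recorded order.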

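    Thanks to Rajat Jaiswal, LeasMaps and Tim Biegeleisen for their contributions. Much appreciated.

    The key was the suggestion to add an extra column to the original table to serve as the column headings in the transformed table. This turned out to be relatively easy to do (I did it in MS Excel).

    So my original table was edited to look like this...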

    --------------------------------------------------------------
    Patient_ID |  Event_ID   | Diagnosis_Code | DiagCode_Counter |
    --------------------------------------------------------------
    Pers001    | HospStay001 |     C139       | Diagnosis_Code_1 |
    Pers001    | HospStay001 |     I245       | Diagnosis_Code_2 |
    Pers001    | HospStay001 |     D456       | Diagnosis_Code_3 |
    Pers001    | HospStay002 |     C139       | Diagnosis_Code_1 |
    Pers001    | HospStay002 |     J123       | Diagnosis_Code_2 |
    Pers555    | HospStay001 |     D312       | Diagnosis_Code_1 |
    Pers999    | HospStay001 |     C120       | Diagnosis_Code_1 |
    Pers999    | HospStay001 |     E101       | Diagnosis_Code_2 |
    --------------------------------------------------------------
    

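    In the newly added "DiagCode_Counter" field, the numeric suffix is incremented by 1 each time a new Diagnosis_Code value is recorded against a unique "Event_ID".

    I was then able to create a crosstab query in MS Access, using the "Patient_ID" and "Event_ID" fields as row headings, the "DiagCode_Counter" field as the column heading, and the "Diagnosis_Code" entries as the values.

    > Rajat is correct - you need some kind of column to use for creating diagnosis_column_1, diagnosis_column_2, ...

    To do this in MS Access, I would:

    1. Create a dummy column to hold the diagnosis number.
    2. Populate it using VBA (faster for large databases), as follows: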

    Sub Update_Diagnosis_Code_ID()
        Dim db As DAO.Database
        Dim rs As DAO.Recordset
        Dim strSQL As String
        Dim dummyId As Long
        Dim patientID As String
        Dim eventID As String
        Dim lastPatientID As String
        Dim lastEventID As String

        ' Rows are sorted so that all diagnoses of one hospital stay are contiguous
        strSQL = "SELECT Inpat.Dummy_id, Inpat.Patient_id, Inpat.Event_ID, Inpat.Diagnosis_Code " & _
                 "FROM Inpat ORDER BY Inpat.Patient_id, Inpat.Event_ID;"
        Set db = CurrentDb
        Set rs = db.OpenRecordset(strSQL, dbOpenDynaset)

        With rs
            Do While Not .EOF
                patientID = .Fields(1)   ' Patient_id
                eventID = .Fields(2)     ' Event_ID

                If patientID <> lastPatientID Or eventID <> lastEventID Then
                    dummyId = 1              ' new patient or new hospital stay: restart at 1
                Else
                    dummyId = dummyId + 1    ' next diagnosis within the same stay
                End If

                .Edit
                .Fields(0) = dummyId     ' Dummy_id
                .Update

                ' Remember this record's keys for the next comparison
                lastPatientID = patientID
                lastEventID = eventID
                .MoveNext
            Loop
        End With

        rs.Close
        Set rs = Nothing
        Set db = Nothing
        MsgBox "Finished", vbInformation
    End Sub
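
The same numbering can also be expressed as a query instead of a loop. The sketch below uses a correlated subquery that counts, for each row, how many diagnoses in the same stay sort at or before it. It assumes the Inpat table from the VBA above, that a given code appears at most once per stay (duplicates would share a number), and that numbering by alphabetical code order is acceptable; none of this comes from the original answer.

```sql
-- Sketch: number each diagnosis within its hospital stay without a loop.
-- Assumes table Inpat(Patient_id, Event_ID, Diagnosis_Code).
SELECT
    a.Patient_id,
    a.Event_ID,
    a.Diagnosis_Code,
    (SELECT COUNT(*)
       FROM Inpat AS b
      WHERE b.Patient_id = a.Patient_id
        AND b.Event_ID = a.Event_ID
        AND b.Diagnosis_Code <= a.Diagnosis_Code) AS Dummy_id
FROM Inpat AS a
ORDER BY a.Patient_id, a.Event_ID, a.Diagnosis_Code;
```

As far as I know, the Access SQL view does not accept comments, so the two comment lines would need to be removed before pasting the query there. On large tables the correlated subquery may well be slower than the VBA loop, which is consistent with the remark above that VBA is faster for large databases.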
    

    Then, to display the data using a crosstab query:

    TRANSFORM First(Inpat.[Diagnosis_Code]) AS FirstOfDiagnosis_Code
    SELECT Inpat.[Patient_id], Inpat.[Event_ID], Count(Inpat.[Diagnosis_Code]) 
    AS [Total Of Diagnosis_Code]
    FROM Inpat
    GROUP BY Inpat.[Patient_id], Inpat.[Event_ID]
    PIVOT Inpat.[Dummy_id];
    
