SQL 通过基于相同"key"列合并行、合并不相同的"non-key"列来减少表



我有一个表,其中包含许多冗余信息,我想减少这些信息。从这个例子开始,我想从这个表开始:

+-------------+--------+----------+----------+------------------+-----------+------+
|  facility   | device | location | property |       name       | data_type | rate |
+-------------+--------+----------+----------+------------------+-----------+------+
| Chicago     | SVR2   |    501.1 | MAX_POW  | Maximum Power    | float     |   10 |
| Chicago     | SVR3   |    501.1 | MAX_POW  | Maximum Power    | float     |   10 |
| ...         |        |          |          |                  |           |      |
| Chicago     | SVR64  |    501.1 | MAX_POW  | Maximum Power    | float     |   10 |
| Chicago     | SVR1   |    501.1 | OS       | Operating System | int       |    0 |
| Minneapolis | SVR1   |       65 | MAX_POW  | Maximum Power    | float     |   10 |
+-------------+--------+----------+----------+------------------+-----------+------+

到这个表

+----------+--------+----------+----------+------------------+-----------+------+
| facility | device | location | property |       name       | data_type | rate |
+----------+--------+----------+----------+------------------+-----------+------+
| .*       | SVRd+ | .*       | MAX_POW  | Maximum Power    | float     |   10 |
| Chicago  | SVRd+ | .*       | OS       | Operating System | int       |    0 |
+----------+--------+----------+----------+------------------+-----------+------+

最有可能通过这个中间表:


+---------------------+---------------------+----------+----------+---------------+-----------+------+
|      facility       |       device        | location | property |     name      | data_type | rate |
+---------------------+---------------------+----------+----------+---------------+-----------+------+
| Chicago,Minneapolis | SVR2,SVR3,...,SVR64 | 501.1,65 | MAX_POW  | Maximum Power | float     |   10 |
| Minneapolis         | SVR1                |       65 | MAX_POW  | Maximum Power | float     |   10 |
+---------------------+---------------------+----------+----------+---------------+-----------+------+

因此,这里我考虑前3列facilitydevicelocation";非密钥";列和所有其他"列";键";colu女士。我想合并具有相同键列值的行,并为合并的行生成一个匹配所有现有非键列的正则表达式。我知道最后一个并不容易,所以我想为相同的键列获得一个非键列的所有值的联合集。

我想在SQLite中做这件事。我对如何获得具有相同键列的列有一个基本的想法,但我不知道如何获得非键列的联合值。如果做不到,我会写一个python脚本来做。

您很可能想要GROUP_CONCAT():

SELECT GROUP_CONCAT(DISTINCT facility) facility, 
GROUP_CONCAT(DISTINCT device) device, 
GROUP_CONCAT(DISTINCT location) location, 
property, name, data_type, rate
FROM tablename
GROUP BY property, name, data_type, rate

请参阅演示。

> facility            | device               | location   | property | name             | data_type | rate
> :------------------ | :------------------- | :--------- | :------- | :--------------- | :-------- | ---:
> Chicago,Minneapolis | SVR2,SVR3,SVR64,SVR1 | 501.1,65.0 | MAX_POW  | Maximum Power    | float     |   10
> Chicago             | SVR1                 | 501.1      | OS       | Operating System | int       |    0

相关内容

最新更新