单独计算表中的所有列

尝试在表上进行一些数据探索，以单独计算行中所有列和所有唯一字符串，如以下输出:

+------------+-------------+------------+----------+
| table_name | column_name | distinct   | count(*) |
|            |             | row_string |          |
+------------+-------------+------------+----------+
| customer   | state       | WA         | 15       |
+------------+-------------+------------+----------+
| customer   | state       | NSW        | 786      |
+------------+-------------+------------+----------+
| customer   | state       | SA         | 51       |
+------------+-------------+------------+----------+
| ...        | ...         | ...        | ...      |
+------------+-------------+------------+----------+
| customer   | zip_code    | 3563       | 33       |
+------------+-------------+------------+----------+
| customer   | zip_code    | 7583       | 52       |
+------------+-------------+------------+----------+
| customer   | zip_code    | 3453       | 553      |
+------------+-------------+------------+----------+
| customer   | zip_code    | 2132       | 211      |
+------------+-------------+------------+----------+
| ...        | ...         | ...        | ...      |
+------------+-------------+------------+----------+

我一直在做这样的事情:

select state, count(*)
from customer
group by state
union
select zip_code, count(*)
from customer
group by zip_code
union
...

但是，如果您的表中有成堆的列，那么这是不有效的。有没有更有效的方法来实现这一点?

(下面是在我看到Martin Smith在上面评论中发布的链接之前发布的。使用XML的方法看起来更简单，并且可能具有大致相似的性能。)

您可以尝试生成如下动态SQL:

select
cast(N'customer' AS sysname) AS table_name,
C.column_name,
C.value,
count(*) AS [count]
from [customer] T
cross apply (
values
(cast(N'state' AS sysname), cast(T.[state] AS nvachar(max)),
(cast(N'zip_code' AS sysname), cast(T.[zip_code] AS nvachar(max))
) C(column_name, value)
group by column_name, value

CAST()操作可能看起来有点过分，但是如果较短的数据后面跟着较长的数据，它们可能是必要的，以防止截断。对于某些数据类型，如DATETIME或FLOAT，您可能希望在格式化中更精确，甚至可能有意限制多余的精度。

当向生成的SQL中注入表名和列名时，使用QUOTENAME(...)和QUOTENAME(..., '''')安全地[quote]或'quote'注入的值

相关内容

最新更新

热门标签：