你能帮我在SQL(MS SQL Server 2017(中解决下面的任务吗。它在Excel中很简单,但在SQL中似乎很复杂。
有一张表格,其中列出了按天数划分的客户及其活动:
client 1may 2may 3may 4may 5may other days
client1 0 0 0 0 0 ...
client2 0 0 0 0 0 ...
client3 0 0 0 0 0 ...
client4 1 1 1 1 1 ...
client5 1 1 1 0 0 ...
有必要创建相同的表(相同数量的行和列(,但根据规则将值转换为新值:当日值=
A( 如果当天前一周的所有日常值,包括当前值=1,则1
B( 如果当天前一周的所有日常值,包括当前值=0,则0
C( 如果值不同,则我们保留前一天的状态(如果前一天状态未知,例如,客户端是新的,则为0(
在Excel中,我使用公式:=IF(AND(AF2=AE2;AE2=AD2;AD2=AC2;AC2=AB2;AB2=AA2;AA2=Z2(;current_day_value;IF(previous_day_value="0"previous_day-value((。
附excel文件示例。
非常感谢。
首先,将日期作为列从来都不是一个好主意。
因此,步骤#1将列转换为行。在其他世界中,构建一个具有三列的表
```
client date Value
client1 May1 0
client1 May2 0
client1 May3 0
.... ... ..
client4 May1 1
client4 May2 1
client4 May3 1
.... ... ..
```
步骤#2使用日期字段执行所有需要的计算。
基本上,您将始终置于前一天的状态,在任何情况下(null除外(。
因此,我会做这样的事情(oracle语法,也在sql server中工作(,假设第一列是1may
Insert into newTable (client, 1may,2may,....) select (client, 0, coalesce(1may,0), coalesce (2may,0), .... from oldTable;
不管怎样,我也认为把日子作为关系表的列不是一个好的做法。
您将很难解决这个问题,因为大多数SQL品牌都不允许"任意枢转";,也就是说,你需要指定你想在数据透视表上显示的列——而Excel只会为你做这件事。SQL可以做到这一点,但它需要动态SQL,这可能会变得非常复杂,而且很快就会令人讨厌。
我建议您只使用sql来构建数据,然后使用excel或SSRS(就像您在TSQL中一样(来实际进行可视化。
不管怎样。我认为这是你想要的:
WITH Data AS (
SELECT * FROM (VALUES
('Client 1',CONVERT(DATE, '2020-05-04'),1)
, ('Client 1',CONVERT(DATE, '2020-05-05'),1)
, ('Client 1',CONVERT(DATE, '2020-05-06'),1)
, ('Client 1',CONVERT(DATE, '2020-05-07'),0)
, ('Client 1',CONVERT(DATE, '2020-05-08'),0)
, ('Client 1',CONVERT(DATE, '2020-05-09'),0)
, ('Client 1',CONVERT(DATE, '2020-05-10'),1)
, ('Client 1',CONVERT(DATE, '2020-05-11'),1)
, ('Client 1',CONVERT(DATE, '2020-05-12'),1)
, ('Client 2',CONVERT(DATE, '2020-05-04'),1)
, ('Client 2',CONVERT(DATE, '2020-05-05'),0)
, ('Client 2',CONVERT(DATE, '2020-05-06'),0)
, ('Client 2',CONVERT(DATE, '2020-05-07'),1)
, ('Client 2',CONVERT(DATE, '2020-05-08'),0)
, ('Client 2',CONVERT(DATE, '2020-05-09'),1)
, ('Client 2',CONVERT(DATE, '2020-05-10'),0)
, ('Client 2',CONVERT(DATE, '2020-05-11'),1)
) x (Client, RowDate, Value)
)
SELECT
Client
, RowDate
, Value
, CASE
WHEN OnesBefore = DaysInWeek THEN 1
WHEN ZerosBefore = DaysInWeek THEN 0
ELSE PreviousDayValue
END As FinalCalculation
FROM (
-- This set uses windowing to calculate the intermediate values
SELECT
*
-- The count of the days present in the data, as part of the week may be missing we can't assume 7
-- We only count up to this day, so its in line with the other parts of the calculation
, COUNT(RowDate) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate) AS DaysInWeek
-- Count up the 1's for this client and week, in date order, up to (and including) this date
, COUNT(IIF(Value = 1, 1, NULL)) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate) AS OnesBefore
-- Count up the 0's for this client and week, in date order, up to (and including) this date
, COUNT(IIF(Value = 0, 1, NULL)) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate) AS ZerosBefore
-- get the previous days value, or 0 if there isnt one
, COALESCE(LAG(Value) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate), 0) AS PreviousDayValue
FROM (
-- This set adds a few simple values in that we can leverage later
SELECT
*
, DATEADD(DAY, -DATEPART(DW, RowDate) + 1, RowDate) As WeekCommencing
FROM Data
) AS DataWithExtras
) AS DataWithCalculations
由于您还没有指定表布局,我不知道在我的示例中使用什么表和字段名称。希望如果这是正确的,你可以找到如何用你所拥有的点击它-如果不是,请留下评论
我也会注意到,我故意把这个写得很冗长。如果你不知道";OVER";子句是,您需要进行一些阅读:https://www.sqlshack.com/use-window-functions-sql-server/.要点是,它们进行聚合时并没有真正将行压缩在一起。
编辑:调整计算,以便能够考虑一周中任意天数的
非常感谢大家,尤其是David和Massimo,他们促使我重组了数据。
--we join clients and dates each with each and label clients with 'active' or 'inactive'
with a as (
select client, dates
from (select distinct client from dbo.clients) a
cross join (select dates from dates) b
)
, b as (
select date
,1 end active
,client
from clients a
join dbo.dates b on a.id = b.id
)
select client
,a.dates
,isnull(b.active, 0) active
into #tmp2
from a
left join b on a.client= b.client and a.dates = b.dates
--declare variables - for date start and for loop
declare @min_date date = (select min(dates) from #tmp2);
declare @n int = 1
declare @row int = (select count(distinct dates) from #tmp2) --number of the loop iterations
--delete data from the final results
delete from final_results
--fill the table with final results
--run the loop (each iteration = analyse of each 1-week range)
while @n<=@row
begin
with a as (
--run the loop
select client
,max(dates) dates
,sum (case when active = 1 then 1 else null end) sum_active
,sum (case when active = 0 then 1 else null end) sum_inactive
from #tmp2
where dates between dateadd(day, -7 + @n, @min_date) and dateadd(day, -1 + @n, @min_date)
group by client
)
INSERT INTO [dbo].[final_results]
(client
,[dates]
,[final_result])
select client
,dates
,case when sum_active = 7 then 1 --rule A
when sum_inactive = 7 then 0 -- rule B
else
(case when isnull(sum_active, 0) + isnull(sum_inactive, 0) < 7 then 0
else
(select final_result
from final_results b
where b.dates = dateadd(day, -1, a.dates)
and a.client= b.client) end
) end
from a
set @n=@n+1
end
if object_id(N'tempdb..#tmp2', 'U') is not null drop table #tmp2