路径分析web数据



我正在对我的原始网络分析数据进行一些分析,并试图在我的网站上找到购买前的常见到达路径。我已经将所有数据迁移到下面一个整洁的表/脚本中:

  1. [ORDER_ID]:非唯一VARCHAR(A1000、A1001等)
  2. [VISIT_IN_PATH]:数字(1,2,3,4,5等)
  3. [VISIT_REMAINING]:数字(1,2,3,4,5等)
  4. [频道]:VARCHAR(直接/电子邮件/显示/付费搜索/免费搜索)

我希望得到的是路径的计数,所以例如,只使用ORDER_ID ABC123,我可能会得到以下内容:

计数/VISIT_IN_PATH_1/VISIT_IN-_PATH_2/VISIT_IN._PATH_3/VISIT_IN_PATH_4/VISIT_IN_PATH_51/免费搜索/免费搜索/直接/免费搜索直接

我对SQL还很陌生,我的第一个想法是某种支点,但每次我处理这个问题时,我都会遇到一堵必须定义每种可能性的墙。

肯定有更简单的方法吗?

CREATE TABLE [dbo].[DummyPaths](
                                    [ORDER_ID] [varchar](64) NULL,
                                    [VISIT_IN_PATH] [bigint] NULL,
                                    [VISIT_REMAINING] [bigint] NULL,
                                    [MARKETING_CHANNEL] [varchar](256) NULL
                                )
GO
INSERT INTO DummyPaths (ORDER_ID,VISIT_IN_PATH,VISIT_REMAINING,MARKETING_CHANNEL)
VALUES      ('ABC123','1','5','FREE SEARCH'),
            ('ABC123','2','4','FREE SEARCH'),
            ('ABC123','3','3','DIRECT'),
            ('ABC123','4','2','FREE SEARCH'),
            ('ABC123','5','1','DIRECT'),
            ('ABC124','1','5','OTHER REFERRAL'),
            ('ABC124','2','4','OTHER REFERRAL'),
            ('ABC124','3','3','OTHER REFERRAL'),
            ('ABC124','4','2','OTHER REFERRAL'),
            ('ABC124','5','1','OTHER REFERRAL'),
            ('ABC125','1','5','DIRECT'),
            ('ABC125','2','4','AFFILIATE'),
            ('ABC125','3','3','AFFILIATE'),
            ('ABC125','4','2','AFFILIATE'),
            ('ABC125','5','1','AFFILIATE'),
            ('ABC126','1','5','EMAIL'),
            ('ABC126','2','4','EMAIL'),
            ('ABC126','3','3','DIRECT'),
            ('ABC126','4','2','DIRECT'),
            ('ABC126','5','1','DIRECT'),
            ('ABC127','1','5','FREE SEARCH'),
            ('ABC127','2','4','DIRECT'),
            ('ABC127','3','3','DIRECT'),
            ('ABC127','4','2','FREE SEARCH'),
            ('ABC127','5','1','DIRECT'),
            ('ABC128','1','5','DIRECT'),
            ('ABC128','2','4','EMAIL'),
            ('ABC128','3','3','EMAIL'),
            ('ABC128','4','2','EMAIL'),
            ('ABC128','5','1','DIRECT'),
            ('ABC129','1','5','FREE SEARCH'),
            ('ABC129','2','4','FREE SEARCH'),
            ('ABC129','3','3','FREE SEARCH'),
            ('ABC129','4','2','FREE SEARCH'),
            ('ABC129','5','1','DIRECT')

如果有任何帮助,我将不胜感激。

感谢

我一直在玩这个,发现以下方法有效:

with        dataset
as
(
select  ORDER_ID
        ,[1] as 'VISIT1'
        ,[2] as 'VISIT2'
        ,[3] as 'VISIT3'
        ,[4] as 'VISIT4'
        ,[5] as 'VISIT5'                
from 
(
  select order_id, visit_in_path, MARKETING_CHANNEL
  from dummypaths
) x
pivot
(
  min(MARKETING_CHANNEL)
  for visit_in_path in ([1], [2], [3], [4], [5])
) p
) 

select      x.VISIT1
            ,x.VISIT2
            ,x.VISIT3
            ,x.VISIT4
            ,x.VISIT5
            ,MAX(dupes) as 'count'
from
(
select      
            VISIT1
            ,VISIT2
            ,VISIT3
            ,VISIT4
            ,VISIT5
            ,ROW_NUMBER() over 
                            (partition by    VISIT1
                                            ,VISIT2
                                            ,VISIT3
                                            ,VISIT4
                                            ,VISIT5
                            order by        VISIT1 asc
                            ) as 'dupes'
from dataset
) x
group by    x.VISIT1
            ,x.VISIT2
            ,x.VISIT3
            ,x.VISIT4
            ,x.VISIT5

最新更新