Find the date/financial quarter (yyyy/mm/dd) in a file name, then advance the date in the file name by 3 months



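Problem: there are 100+ tables to combine. The only difference between their names is the date. Rather than writing 100+ code blocks and changing only the date, it makes more sense to write a Python script that outputs the SQL code:

Our SQL query for one table (i.e. one financial quarter):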

SELECT 
  (SELECT
      PCR.repdte
      FROM 
      All_Reports_19921231_Performance_and_Condition_Ratios as 
      PCR) AS Quarter,
  (SELECT 
      Round(AVG(PCR.lnlsdepr))
      FROM 
      All_Reports_19921231_Performance_and_Condition_Ratios as 
      PCR) AS NetLoansAndLeasesToDeposits,
  (SELECT sum(CAST(LD.IDdepsam as int))
      FROM 
     'All_Reports_19921231_Deposits_Based_on_the_Dollars250,000
     _Reporting_Threshold' AS LD) AS 
     DepositAccountsWith$LessThan$250k

The naming convention of each file name includes the date (financial quarter):

All_Reports_19921231_Performance_and_Condition_Ratios
All_Reports_19921231_Performance_and_Condition_Ratios
All_Reports_19921231_Deposits_Based_on_the_Dollars250,000
_Reporting_Threshold
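
For step one (finding the date in a file name), a regular expression is one option. The following is a minimal sketch, assuming every table name starts with `All_Reports_` followed by an eight-digit date; the names `DATE_IN_NAME` and `quarter_of` are hypothetical and only used for illustration.

```python
import re

# Assumption: the date is the eight digits right after "All_Reports_".
DATE_IN_NAME = re.compile(r"All_Reports_(\d{8})_")

def quarter_of(table_name):
    """Return the yyyymmdd string embedded in a table name, or None if absent."""
    match = DATE_IN_NAME.search(table_name)
    return match.group(1) if match else None

print(quarter_of("All_Reports_19921231_Performance_and_Condition_Ratios"))
```

A name that does not follow the convention yields `None`, which makes it easy to skip unrelated tables.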

We want to query all financial quarters, starting with 19921231:

    19921231
    19930331
    19930630 
    19930930
    19931231
    19940331
    19940630
    19940930
    19941231
    19950331
    19950630
    19950930
    19951231
     …..
     ….
    20180930

The script will:

Step one:  find the yyyy/mm/dd in the file name (e.g. 19921231)
Step two:  copy the query
Step three: change the yyyy/mm/dd in the copied file name(s)
IF 1231 change to "+1" 0331  (e.g. 19921231 to 19930331)
IF 0331 change to 0630       (e.g. 19930331 to 19930630)
IF 0630 change to 0930       (e.g. 19930630 to 19930930)
IF 0930 change to 1231       (e.g. 19930930 to 19931231)
IF 1231 change to +1 0331    (e.g. 19931231 to 19940331)
…..
…..
…..
IF 91231 change to 00331  (e.g. 19991231 to 20000331)
….
IF 91231 change to 00331 (e.g. 20091231 to 20100331)
Step four: print new code block after UNION ALL 
Step five: repeat step three
Step six: repeat step four
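
The rules in step three can be expressed as a small lookup plus a year increment. This is only a sketch of those rules, assuming every input is a quarter-end date ending in 0331, 0630, 0930 or 1231; the function name `next_quarter` is hypothetical. Treating the year as an integer means the decade and century rollovers need no special cases.

```python
def next_quarter(stamp):
    """Advance a yyyymmdd quarter-end string to the following quarter end."""
    year, mmdd = int(stamp[:4]), stamp[4:]
    # Each quarter end maps to the next one; 1231 wraps around to 0331.
    following = {"0331": "0630", "0630": "0930", "0930": "1231", "1231": "0331"}
    if mmdd not in following:
        raise ValueError(f"not a quarter-end date: {stamp}")
    if mmdd == "1231":
        year += 1  # the "+1" case: roll over into the next year
    return f"{year:04d}{following[mmdd]}"

print(next_quarter("19921231"))
print(next_quarter("19991231"))
```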

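The input will be one financial quarter (see the code block above). The output is that code block repeated 100+ times, with only the yyyy/mm/dd in each file name changed. Each code block is joined to the next with UNION ALL: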

SELECT 
  (SELECT
      PCR.repdte
      FROM 
      All_Reports_19921231_Performance_and_Condition_Ratios as 
      PCR) AS Quarter,
  (SELECT 
      Round(AVG(PCR.lnlsdepr))
      FROM 
      All_Reports_19921231_Performance_and_Condition_Ratios as 
      PCR) AS NetLoansAndLeasesToDeposits,
 (SELECT sum(CAST(LD.IDdepsam as int))
      FROM 
     'All_Reports_19921231_Deposits_Based_on_the_Dollars250,000
      _Reporting_Threshold' AS LD) AS 
      DepositAccountsWith$LessThan$250k
UNION ALL
SELECT 
  (SELECT
      PCR.repdte
      FROM 
      All_Reports_19930331_Performance_and_Condition_Ratios as 
      PCR) AS Quarter,
   (SELECT 
     Round(AVG(PCR.lnlsdepr))
     FROM All_Reports_19930331_Performance_and_Condition_Ratios 
     as PCR) AS NetLoansAndLeasesToDeposits,
   (SELECT sum(CAST(LD.IDdepsam as int))
     FROM 
    'All_Reports_19930331_Deposits_Based_on_the_Dollars250,000
     _Reporting_Threshold' AS LD) AS 
     DepositAccountsWith$LessThan$250k

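The first step is to write a generator that yields date strings in three-month increments, from a given date string up to the present. We can store the initial date in a datetime object, use it at each step to produce the next date string, and stop once the date passes the current date. We can use calendar.monthrange to find the last day of a given month.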

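from datetime import datetime
from calendar import monthrange
def dates_from(start):
    # Parse the starting quarter-end date, e.g. "19921231".
    date = datetime.strptime(start, "%Y%m%d")
    today = datetime.today()
    # Yield quarter-end dates until we pass the current date.
    while date < today:
        yield date.strftime("%Y%m%d")
        # Advance three months, rolling over into the next year if needed.
        month = date.month + 3
        year = date.year
        if month > 12:
            year += 1
            month -= 12
        # monthrange returns (weekday of the first day, number of days in the month).
        _, day = monthrange(year, month)
        date = datetime(year, month, day)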
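
Because the generator above runs up to today's date, its output changes over time. If the last quarter is known (20180930 in the question), a variant with an explicit end date gives a reproducible list. This is a sketch under that assumption; `dates_between` and its `end` parameter are hypothetical additions, not part of the code above.

```python
from calendar import monthrange
from datetime import datetime

def dates_between(start, end):
    """Yield quarter-end yyyymmdd strings from start to end, both inclusive."""
    date = datetime.strptime(start, "%Y%m%d")
    last = datetime.strptime(end, "%Y%m%d")
    while date <= last:
        yield date.strftime("%Y%m%d")
        month = date.month + 3
        year = date.year
        if month > 12:
            year += 1
            month -= 12
        # Last day of the new month, so 0331/0630/0930/1231 come out right.
        _, day = monthrange(year, month)
        date = datetime(year, month, day)

quarters = list(dates_between("19921231", "20180930"))
print(len(quarters), quarters[:3], quarters[-1])
# → 104 ['19921231', '19930331', '19930630'] 20180930
```

The count of 104 matches the "100+" tables mentioned in the question.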

We can then use string formatting to inject each value into a template string:

sql_template = """
SELECT 
  (SELECT
      PCR.repdte
      FROM 
      All_Reports_{0}_Performance_and_Condition_Ratios as 
      PCR) AS Quarter,
  (SELECT 
      Round(AVG(PCR.lnlsdepr))
      FROM 
      All_Reports_{0}_Performance_and_Condition_Ratios as 
      PCR) AS NetLoansAndLeasesToDeposits,
  (SELECT sum(CAST(LD.IDdepsam as int))
      FROM 
     'All_Reports_{0}_Deposits_Based_on_the_Dollars250,000
     _Reporting_Threshold' AS LD) AS 
     DepositAccountsWith$LessThan$250k"""
queries = map(sql_template.format, dates_from("19921231"))
output_string = "\nUNION ALL\n".join(queries)
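
If you would rather not turn the query into a template by hand, another option is to take the single-quarter query as-is and substitute the date with `str.replace`, which mirrors the "copy the query, then change the date" steps in the question. The sketch below uses a deliberately shortened, hypothetical query and a two-item date list to keep it readable; `union_for` is an illustrative name.

```python
# Hypothetical, abbreviated stand-in for the full single-quarter query.
single_quarter_sql = """SELECT PCR.repdte
FROM All_Reports_19921231_Performance_and_Condition_Ratios AS PCR"""

def union_for(query, start, dates):
    """Copy the query once per date, swapping the start date for each new date."""
    blocks = [query.replace(start, date) for date in dates]
    return "\nUNION ALL\n".join(blocks)

combined = union_for(single_quarter_sql, "19921231", ["19921231", "19930331"])
print(combined)
```

This relies on the date string appearing in the query only inside table names; if the same digits could occur elsewhere, the template approach is safer.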
