Function only works on the first two lists in a list of lists



I have this list:

mylist = [
    [1890731350060, 'February 2016, March 2016, January 2016', 'INDEMNIZATIA DE HRANA', 1183], 
    [1890922350110, 'May 2015, June 2015, April 2015', 'INDEMNIZATIA DE HRANA', 1183], 
    [1890731350060, 'February 2016, March 2016, January 2016', 'INDEMNIZATIA DE HRANA', 1183]
]

My desired output:

mylist = [
    [1890731350060, 'Ian 2016, Feb 2016, Mar 2016', 'INDEMNIZATIA DE HRANA', 1183],
    [1890922350110, 'Apr 2015, Mai 2015, Iun 2015', 'INDEMNIZATIA DE HRANA', 1183],
    [1890731350060, 'Ian 2016, Feb 2016, Mar 2016', 'INDEMNIZATIA DE HRANA', 1183]
]

To do this, I have these two functions:

from datetime import datetime
import re
def translateInRo(string, dyct):
    substrs = sorted(dyct, key=len, reverse=True)
    regexp = re.compile('|'.join(map(re.escape, substrs)))
    return regexp.sub(lambda match: dyct[match.group(0)], string)
def orderDateslist(thislist):
    i=0
    for dates in thislist:
        sorted_list = []
        chgDates = dates[1].split(",")
        for test1 in chgDates:
            sorted_list.append(test1.strip())
        test = sorted(sorted_list, key=lambda x: datetime.strptime(x, "%B %Y"))
        str1 = ', '.join(test)
        translate = translateInRo(
            str1, {"January": "Ian", "February": "Feb", "March": "Mar", "April": "Apr", "May": "Mai", "June": "Iun", "July": "Iul", "August": "Aug", "September": "Sept", "October": "Oct", "November": "Nov", "December": "Dec"})
        thislist[i][1] = translate
        i = + 1
    return thislist

When I print:

print (orderDateslist(mylist))
[[1890731350060, 'Ian 2016, Feb 2016, Mar 2016', 'INDEMNIZATIA DE HRANA', 1183], [1890922350110, 'Ian 2016, Feb 2016, Mar 2016', 'INDEMNIZATIA DE HRANA', 1183], [1890731350060, 'February 2016, March 2016, January 2016', 'INDEMNIZATIA DE HRANA', 1183]]

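The last list is not processed: the function only works on the first two lists and leaves the rest unchanged. I want this function to work on a large number of lists. What do I have to change? I am using Python 3. Also, the last list's result is being duplicated into the second one.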

Problem

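To clarify the problem: from your expected output, it appears you want to replace the string at index 1 of each sublist so that it:

  1. sorts the dates chronologically
  2. abbreviates the months according to a translation dictionary

This can be done as follows: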

# Given 
import datetime

mylist = [
    [1890731350060, 'February 2016, March 2016, January 2016', 'INDEMNIZATIA DE HRANA', 1183], 
    [1890922350110, 'May 2015, June 2015, April 2015',         'INDEMNIZATIA DE HRANA', 1183], 
    [1890731350060, 'February 2016, March 2016, January 2016', 'INDEMNIZATIA DE HRANA', 1183]
]
TRANSLATE = {
    "January": "Ian", "February": "Feb", "March": "Mar", "April": "Apr",
    "May": "Mai", "June": "Iun", "July": "Iul", "August": "Aug", 
    "September": "Sept", "October": "Oct", "November": "Nov", "December": "Dec"
}

Code

def transform_dates(iterable, translate=TRANSLATE):
    transformed_lists = []
    for i, sublst in enumerate(iterable):
        transformed_lists.append(sublst[:])
        # Clean dates string
        raw_dates = sublst[1]
        cleaned_dates = set(map(str.strip, raw_dates.split(",")))
        # Sort dates string
        months_yrs = sorted(cleaned_dates, key=lambda x: datetime.datetime.strptime(x, "%B %Y"))
        months_yrs_split = [i.split() for i in months_yrs]
        # Abbreviate months
        abbrev_dates = [" ".join((translate[i[0]], i[1])) for i in months_yrs_split]
        transformed_lists[i][1] = ", ".join(abbrev_dates)
    return transformed_lists
transform_dates(mylist)
# [[1890731350060, 'Ian 2016, Feb 2016, Mar 2016', 'INDEMNIZATIA DE HRANA',1183],
#  [1890922350110, 'Apr 2015, Mai 2015, Iun 2015', 'INDEMNIZATIA DE HRANA',1183],
#  [1890731350060, 'Ian 2016, Feb 2016, Mar 2016', 'INDEMNIZATIA DE HRANA',1183]]
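
As a side note on why the original orderDateslist() stopped after the first two sublists: the line `i = + 1` assigns positive one to `i` on every pass instead of incrementing it (`i += 1`), so from the second iteration onward every result is written to index 1. The small sketch below isolates that behaviour; the variable names and the placeholder `rows` list are made up for illustration and are not from the question.

```python
# `i = + 1` is parsed as `i = (+1)`: it assigns 1, it does not increment.
buggy_indices = []
i = 0
for _ in range(3):
    buggy_indices.append(i)
    i = + 1
print(buggy_indices)  # [0, 1, 1]

# `i += 1` is the increment that was presumably intended.
fixed_indices = []
i = 0
for _ in range(3):
    fixed_indices.append(i)
    i += 1
print(fixed_indices)  # [0, 1, 2]

# enumerate() avoids the manual counter altogether.
rows = ["first", "second", "third"]  # placeholder rows for illustration
enumerated_indices = [index for index, _ in enumerate(rows)]
print(enumerated_indices)  # [0, 1, 2]
```

This matches the printed result in the question: index 1 ends up holding the third sublist's dates, and index 2 is never written.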

Notes

This function sorts by both month and year.

lst = [[1890731350060, 'February 2015, March 2013, January 2016', 'INDEMNIZATIA DE HRANA', 1183]]
transform_dates(lst)
# [[1890731350060, 'Mar 2013, Feb 2015, Ian 2016', 'INDEMNIZATIA DE HRANA', 1183]]

This function removes duplicate dates.

lst = [[1890731350060, 'May 2016, June 2016, May 2016, July 2016', 'INDEMNIZATIA DE HRANA', 1183]]
transform_dates(lst)
# [[1890731350060, 'Mai 2016, Iun 2016, Iul 2016', 'INDEMNIZATIA DE HRANA', 1183]]

Details

If you are new to Python, I have added these details to help explain what is happening.

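The transform_dates() function accepts a list, here mylist, as an argument. Inside the function, we first create a new list called transformed_lists, to which we will append items later. We then loop over iterable (which is mylist in this call) to get each sublist while tracking its index position (i).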

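We append a copy of sublst to transformed_lists (hence the [:], which keeps us from modifying the original items in mylist). Then we start working on index 1, which holds the dates string. We first clean it: we split it on commas into a list of month-year strings and strip the leading and trailing whitespace from each, e.g. ['February 2016', 'March 2016', 'January 2016']. If there are any duplicate dates, set() removes them, since a set is a collection of unique elements.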
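
The copy and cleaning steps can be tried in isolation. This is only a sketch: the sample row is taken from the notes above, and the names `original` and `working_copy` are made up for illustration.

```python
raw_dates = "May 2016, June 2016, May 2016, July 2016"

# Split on commas, strip surrounding whitespace, and drop duplicates.
# A set has no defined order, which is fine because sorting comes next.
cleaned_dates = set(map(str.strip, raw_dates.split(",")))
print(cleaned_dates == {"May 2016", "June 2016", "July 2016"})  # True

# Slicing with [:] makes a shallow copy, so reassigning an index in the
# copy leaves the original list untouched.
original = [1890731350060, raw_dates, "INDEMNIZATIA DE HRANA", 1183]
working_copy = original[:]
working_copy[1] = "changed"
print(original[1] == raw_dates)  # True
```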

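In preparation for the next step, we take this opportunity to sort the dates and to split() each one further on its single space. The split produces a temporary nested list, e.g. [['January', '2016'], ['February', '2016'], ['March', '2016']].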
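
A short sketch of the sort-and-split step on its own, using the same key function as transform_dates(). Note that "%B" matches full month names in the current locale, so this assumes the default (English) locale.

```python
import datetime

cleaned_dates = {"February 2016", "March 2016", "January 2016"}

# Parse each "Month Year" string into a datetime so the sort is chronological.
months_yrs = sorted(
    cleaned_dates, key=lambda x: datetime.datetime.strptime(x, "%B %Y")
)

# Split each entry on whitespace into [month, year].
months_yrs_split = [item.split() for item in months_yrs]
print(months_yrs_split)
# [['January', '2016'], ['February', '2016'], ['March', '2016']]
```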

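Finally, for each item in that nested list, we abbreviate the month using the TRANSLATE dictionary and join() it back together with the year, producing a list of new strings, e.g. ['Ian 2016', 'Feb 2016', 'Mar 2016']. We then perform a final join(), with the items delimited by a comma and a space (as requested), e.g. 'Ian 2016, Feb 2016, Mar 2016'.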
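
The abbreviation and the two join() calls can be sketched separately as well. The dictionary is the same TRANSLATE mapping as above; the loop variables `month` and `year` are just illustrative names.

```python
TRANSLATE = {
    "January": "Ian", "February": "Feb", "March": "Mar", "April": "Apr",
    "May": "Mai", "June": "Iun", "July": "Iul", "August": "Aug",
    "September": "Sept", "October": "Oct", "November": "Nov", "December": "Dec",
}

months_yrs_split = [["January", "2016"], ["February", "2016"], ["March", "2016"]]

# Look up the abbreviation and glue it back onto the year with a space.
abbrev_dates = [" ".join((TRANSLATE[month], year)) for month, year in months_yrs_split]
print(abbrev_dates)  # ['Ian 2016', 'Feb 2016', 'Mar 2016']

# Join the abbreviated dates into the final comma-separated string.
result = ", ".join(abbrev_dates)
print(result)  # Ian 2016, Feb 2016, Mar 2016
```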

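We have now finished transforming the string. All that remains is to replace the old string at index 1 of the current entry in transformed_lists by assigning the new string to that index. In summary, we have systematically picked out the string, broken it apart, transformed it, put it back together, and reassigned it to its original position in the list. We repeat this process for each sublist in iterable until the loop completes. The result is transformed_lists, which the function returns.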

You can try the following:

import re
import itertools
def orderdates(full_date):
    table = {"January": "Ian", "February": "Feb", "March": "Mar", "April": "Apr", "May": "Mai", "June": "Iun", "July": "Iul", "August": "Aug", "September": "Sept", "October": "Oct", "November": "Nov", "December": "Dec"}
    l = ["Ian", "Feb", "Mar", "Apr", "Mai", "Iun", "Iul", "Aug", "Sept", "Oct", "Nov", "Dec"]
    # Split on a comma followed by one whitespace character
    new_dates = re.split(r",\s", full_date)
    final_dates = [[a, int(b)] for a, b in [i.split() for i in new_dates]]
    # Sort by year, then group the entries that share a year
    new_dates = sorted(final_dates, key = lambda x: x[-1])
    current = [list(b) for a, b in itertools.groupby(new_dates, lambda x: x[-1])]
    # Abbreviate the months, then order them within each year
    new_current = [[table[i]+" "+str(b) for i, b in c] for c in current]
    final_current = [sorted(b, key= lambda x:l.index(x.split()[0])) for b in new_current]
    return list(itertools.chain.from_iterable(final_current))

mylist = [[1890731350060, 'January 2016, February 2016, March 2015', 'INDEMNIZATIA DE HRANA', 1183], [1890922350110, 'May 2015, June 2015, April 2015', 'INDEMNIZATIA DE HRANA', 1183], [1890731350060, 'February 2016, March 2016, January 2016', 'INDEMNIZATIA DE HRANA', 1183]]
new_data = [[i[0], orderdates(i[1]), i[2:]] for i in mylist]
new_data = [list(itertools.chain(*[[b] if not isinstance(b, list) else b for b in i])) for i in new_data]
print(new_data)

Output:

[[1890731350060, 'Mar 2015', 'Ian 2016', 'Feb 2016', 'INDEMNIZATIA DE HRANA', 1183], [1890922350110, 'Apr 2015', 'Mai 2015', 'Iun 2015', 'INDEMNIZATIA DE HRANA', 1183], [1890731350060, 'Ian 2016', 'Feb 2016', 'Mar 2016', 'INDEMNIZATIA DE HRANA', 1183]]
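
Note that this output spreads the dates over separate list elements, whereas the question asks for a single comma-separated string at index 1. If that form is wanted, one possible adjustment is to join the dates before rebuilding the row. The sketch below hard-codes what orderdates() returns for the first row so that it runs on its own; `ordered` and `new_row` are illustrative names.

```python
row = [1890731350060, 'January 2016, February 2016, March 2015',
       'INDEMNIZATIA DE HRANA', 1183]

# What orderdates(row[1]) returns for this row, hard-coded here
# so the sketch is self-contained.
ordered = ['Mar 2015', 'Ian 2016', 'Feb 2016']

# Join the dates into one string and keep the remaining fields as they are.
new_row = [row[0], ", ".join(ordered), *row[2:]]
print(new_row)
# [1890731350060, 'Mar 2015, Ian 2016, Feb 2016', 'INDEMNIZATIA DE HRANA', 1183]
```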
