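I'm reading data from a CSV file that contains many duplicate email addresses into a temp-table. The format is basically id, emailtype-description, email.
Here is an example of some of the data: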
id  emailtype-description  email
1   E-Mail                 john@gmail.com
1   preferred E-mail       john@gmail.com
2   2nd E-mail             stacey@yahoo.com
2   preferred-Email        sth@yahoo.com
2   family E-Mail          sth@yahoo.com
cInputFile = SUBSTITUTE(cDataDirectory, "Emails").
INPUT STREAM csv FROM VALUE(cInputFile).
IMPORT STREAM csv DELIMITER "," ^ NO-ERROR.
REPEAT TRANSACTION:
    CREATE ttEmail.
    IMPORT STREAM csv DELIMITER ","
        ttEmail.uniqueid
        ttEmail.emailTypeDescription
        ttEmail.emailAddr
        .
END.
INPUT STREAM csv CLOSE.
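I want to delete the duplicate rows, but not arbitrarily. I want to make sure that certain types take precedence over others. For example, some emails are marked "preferred E-mail". If one of those exists, it should always be kept, and the remaining types have their own order of precedence, so "E-mail" would win over "2nd E-mail" or "family E-mail".
I'd like to do the equivalent, in Progress code, of a custom sort on emailtype-description followed by a dedupe. That way I can define the sort order and then dedupe by priority, keeping the email and type I actually want.
Is there a way to do this with my temp-table in Progress? I want to sort by unique id first and then by emailtype-description, but with a custom sort order rather than an alphabetical one. What is the best way to do this?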
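When you say you want a custom sort rather than an alphabetical one, do you mean you want to order the rows by email type in some non-alphabetical sequence? If so, I think you need to translate the email type into a field that sorts the way you want. Something like this: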
/* first add a field to your ttEmail called emailTypeSortOrder */
define variable emailTypeSortOrderList as character no-undo.

/* highest priority first; entries must match the descriptions in the file
   (by default LOOKUP ignores case, but not spacing or punctuation) */
emailTypeSortOrderList = "preferred E-mail,E-mail,2nd E-mail,family E-mail".

cInputFile = SUBSTITUTE(cDataDirectory, "Emails").
INPUT STREAM csv FROM VALUE(cInputFile).
IMPORT STREAM csv DELIMITER "," ^ NO-ERROR.   /* read and discard the first line */
REPEAT TRANSACTION:
    CREATE ttEmail.
    IMPORT STREAM csv DELIMITER ","
        ttEmail.uniqueid
        ttEmail.emailTypeDescription
        ttEmail.emailAddr
        .
    /* classify the email type sort order */
    ttEmail.emailTypeSortOrder = lookup( ttEmail.emailTypeDescription, emailTypeSortOrderList ).
    /* types that are not in the list sort last */
    if ttEmail.emailTypeSortOrder <= 0 then ttEmail.emailTypeSortOrder = 9999999.
END.
INPUT STREAM csv CLOSE.
Now you can sort and dedupe using the new sort-order field:
for each ttEmail break by ttEmail.emailAddr by ttEmail.emailTypeSortOrder:
    if first-of( ttEmail.emailAddr ) then
        next.              /* always keep the first one */
    else
        delete ttEmail.    /* remove unwanted duplicates... */
end.
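
To try the idea without a CSV file, here is a minimal self-contained sketch that loads the sample rows from the question by hand and then applies the same sort-and-dedupe logic. The temp-table definition, the field data types and the addEmail helper procedure are assumptions made for illustration; the question does not show how ttEmail is actually defined.

```
/* Sketch only: the temp-table layout and the addEmail helper are assumed. */
define temp-table ttEmail no-undo
    field uniqueid             as integer
    field emailTypeDescription as character
    field emailAddr            as character
    field emailTypeSortOrder   as integer.

/* highest priority first */
define variable emailTypeSortOrderList as character no-undo
    initial "preferred E-mail,E-mail,2nd E-mail,family E-mail".

/* sample rows taken from the question */
run addEmail (1, "E-Mail",           "john@gmail.com").
run addEmail (1, "preferred E-mail", "john@gmail.com").
run addEmail (2, "2nd E-mail",       "stacey@yahoo.com").
run addEmail (2, "preferred-Email",  "sth@yahoo.com").
run addEmail (2, "family E-Mail",    "sth@yahoo.com").

/* keep only the highest-priority row for each address */
for each ttEmail break by ttEmail.emailAddr by ttEmail.emailTypeSortOrder:
    if not first-of( ttEmail.emailAddr ) then
        delete ttEmail.
end.

/* show what is left */
for each ttEmail by ttEmail.uniqueid by ttEmail.emailAddr:
    display
        ttEmail.uniqueid
        ttEmail.emailTypeDescription format "x(20)"
        ttEmail.emailAddr            format "x(25)".
end.

/* hypothetical helper: creates one row and assigns its sort order */
procedure addEmail:
    define input parameter piId   as integer   no-undo.
    define input parameter pcType as character no-undo.
    define input parameter pcAddr as character no-undo.

    define variable iOrder as integer no-undo.

    iOrder = lookup( pcType, emailTypeSortOrderList ).
    if iOrder <= 0 then iOrder = 9999999.   /* unknown types sort last */

    create ttEmail.
    assign
        ttEmail.uniqueid             = piId
        ttEmail.emailTypeDescription = pcType
        ttEmail.emailAddr            = pcAddr
        ttEmail.emailTypeSortOrder   = iOrder.
end procedure.
```

With these rows I would expect one record per address to survive: "preferred E-mail" for john@gmail.com, "2nd E-mail" for stacey@yahoo.com, and "family E-Mail" for sth@yahoo.com. The last one is worth noticing: "preferred-Email" is spelled differently from the "preferred E-mail" entry in the list, so it falls into the 9999999 bucket and loses to "family E-Mail". If your file has spelling variants like that, add each variant to the list or normalize the description before calling lookup. Also, if the same address can legitimately appear under two different ids, you may want to break by uniqueid first and then by emailAddr, and test first-of( ttEmail.emailAddr ) within that grouping.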