I wrote an awk command to remove duplicate rows from a .csv file. I'm running Ubuntu 20.04. Here is the command:
awk -F, ' {key = $2 FS} !seen[key]++' gigs.csv > try.csv
I don't want to type it every time, so I made an alias for it in ~/.bash_aliases, like this:
alias dedupe="awk -F, ' {key = $2 FS} !seen[key]++' gigs.csv > try.csv"
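However, when I run dedupe in my terminal, it produces only one line, which is different from what I get when I type the full command. The full command produces the desired result. Did I make a mistake in the alias? Why does this happen, and how can I fix it?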
Here is a sample from the original .csv file:
Tue 30 Aug 08:34:17 AM,Do you use facebook? work remote from home. we are hiring!,https://atlanta.craigslist.org/atl/cpg/d/atlanta-do-you-use-facebook-work-remote/7527729597.html
Mon 29 Aug 03:51:29 PM,Cash for your opinions!,https://atlanta.craigslist.org/atl/cpg/d/atlanta-cash-for-your-opinions/7527517063.html
Mon 29 Aug 01:22:54 PM,Telecommute earn $20 per easy online product test gig w/ free products,https://montgomery.craigslist.org/cpg/d/hope-hull-telecommute-earn-20-per-easy/7527471859.html
Mon 29 Aug 01:53:58 PM,Telecommute earn $20 per easy online product test gig w/ free products,https://atlanta.craigslist.org/atl/cpg/d/smyrna-telecommute-earn-20-per-easy/7527456060.html
Mon 29 Aug 12:50:59 PM,Telecommute earn $20 per easy online product test gig w/ free products,https://bham.craigslist.org/cpg/d/adamsville-telecommute-earn-20-per-easy/7527454527.html
Wed 31 Aug 09:23:41 PM,Looking for a sales development rep,https://bham.craigslist.org/cpg/d/adamsville-looking-for-sales/7528472497.html
Wed 31 Aug 11:21:58 AM,Earn ~$30 | work from home | looking for 'ok google' users | taskverse,https://bham.craigslist.org/cpg/d/harbor-city-earn-30-work-from-home/7528233394.html
Mon 29 Aug 12:50:59 PM,Telecommute earn $20 per easy online product test gig w/ free products,https://bham.craigslist.org/cpg/d/adamsville-telecommute-earn-20-per-easy/7527454527.html
Wed 31 Aug 11:28:56 AM,Earn ~$30 | work from home | looking for 'ok google' users | taskverse,https://tuscaloosa.craigslist.org/cpg/d/harbor-city-earn-30-work-from-home/7528236901.html
Wed 31 Aug 11:27:53 AM,Earn ~$30 | work from home | looking for 'ok google' users | taskverse,https://montgomery.craigslist.org/cpg/d/harbor-city-earn-30-work-from-home/7528236389.html
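Define the alias with single quotes instead of double quotes. With double quotes, the shell expands $2 at the moment the alias is defined; it is normally empty there, so awk ends up with key = FS, every line gets the same key, and only the first line is printed. Nothing is special inside single quotes, so you avoid surprises such as "... $2 ..." being expanded to the value of the second positional parameter. The only catch is that, to include the inner single quotes, you need to break out of the quoting with ' ... '\'' ... ' or ' ... '"'"' ... ':
alias dedupe='awk -F, '\'' {key = $2 FS} !seen[key]++'\'' gigs.csv > try.csv'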
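To see the effect in isolation, here is a minimal sketch. It assumes a small made-up three-row file (not the data from the question) and writes out by hand the awk program that a double-quoted alias would store when $2 is empty at definition time.

```shell
# Hypothetical sample data: three rows, two distinct titles in field 2.
tmp=$(mktemp)
printf '%s\n' 'a,Title one,u1' 'b,Title two,u2' 'c,Title one,u3' > "$tmp"

# What the double-quoted alias effectively runs: the shell already replaced
# $2 with an empty string, so every line gets the same key (just ",").
broken=$(awk -F, ' {key =  FS} !seen[key]++' "$tmp")

# What was intended: awk itself sees $2 and keys on the second field.
intended=$(awk -F, ' {key = $2 FS} !seen[key]++' "$tmp")

printf 'broken:\n%s\n\nintended:\n%s\n' "$broken" "$intended"
rm -f "$tmp"
```

The first variant prints only the first row, while the second keeps one row per distinct title. Running alias dedupe with no other arguments prints the definition the shell actually stored, which is a quick way to check whether $2 survived.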
A function may be preferable in this case:
dedupe () { awk -F, ' {key = $2 FS} !seen[key]++' gigs.csv > try.csv; }
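One thing a function can do that an alias cannot is take arguments. As a sketch only, the variant below accepts the input and output file names as optional parameters; the defaults and the parameter order are assumptions for illustration, not part of the original command.

```shell
# Hypothetical generalization of dedupe:
#   $1 = input file  (defaults to gigs.csv)
#   $2 = output file (defaults to try.csv)
dedupe() {
    # Inside the single quotes, $2 belongs to awk (second CSV field).
    # Outside them, ${1:-...} and ${2:-...} are the function's own arguments.
    awk -F, ' {key = $2 FS} !seen[key]++' "${1:-gigs.csv}" > "${2:-try.csv}"
}
```

Called as dedupe with no arguments it behaves like the original command; called as dedupe other.csv clean.csv it reads and writes the named files instead.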