How to get values from a CSV file when some cells contain commas



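I have a script that imports a CSV file and reads each row to update the corresponding item in Sitecore. It works for many products, but for some products certain cells in the row contain commas (the product description, for example).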

protected void SubmitButton_Click(object sender, EventArgs e)
{
    if (UpdateFile.PostedFile != null)
    {
        var file = UpdateFile.PostedFile;
        // check if valid csv file
        message.InnerText = "Updating...";
        Sitecore.Context.SetActiveSite("backedbybayer");
        _database = Database.GetDatabase("master");
        SitecoreContext context = new SitecoreContext(_database);
        Item homeNode = context.GetHomeItem<Item>();

        var productsItems =
            homeNode.Axes.GetDescendants()
                .Where(
                    child =>
                        child.TemplateID == new ID(TemplateFactory.FindTemplateId<IProductDetailPageItem>()));
        try
        {
            using (StreamReader sr = new StreamReader(file.InputStream))
            {
                var firstLine = true;
                string currentLine;
                var productIdIndex = 0;
                var industryIdIndex = 0;
                var categoryIdIndex = 0;
                var pestIdIndex = 0;
                var titleIndex = 0;
                string title;
                string productId;
                string categoryIds;
                string industryIds;
                while ((currentLine = sr.ReadLine()) != null)
                {
                    var data = currentLine.Split(',').ToList();
                    if (firstLine)
                    {
                        // find index of the important columns
                        productIdIndex = data.IndexOf("ProductId");
                        industryIdIndex = data.IndexOf("PrimaryIndustryId");
                        categoryIdIndex = data.IndexOf("PrimaryCategoryId");
                        titleIndex = data.IndexOf("Title");
                        firstLine = false;
                        continue;
                    }
                    title = data[titleIndex];
                    productId = data[productIdIndex];
                    categoryIds = data[categoryIdIndex];
                    industryIds = data[industryIdIndex];
                    var products = productsItems.Where(x => x.DisplayName == title);
                    foreach (var product in products)
                    {
                        product.Editing.BeginEdit();
                        try
                        {
                            product.Fields["Product Id"].Value = productId;
                            product.Fields["Product Industry Ids"].Value = industryIds;
                            product.Fields["Category Ids"].Value = categoryIds;
                        }
                        finally
                        {
                            product.Editing.EndEdit();
                        }
                    }
                }
            }
            // when done
            message.InnerText = "Complete";
        }
        catch (Exception ex)
        {
            message.InnerText = "Error reading file";
        }             
    }
}

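The problem is that when the description field contains a comma, such as "The product is an effective, preventive biological insecticide", the description is split as well and throws off the indexes, so categoryIds = data[8] gets the wrong value.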
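A minimal sketch of the failure, using a made-up three-column row (the values are hypothetical, not taken from the client's file). It assumes the description cell is wrapped in double quotes, as spreadsheet exports usually do for cells that contain commas:

```csharp
using System;

class SplitDemo
{
    static void Main()
    {
        // Hypothetical row: id, quoted description containing a comma, category id.
        string line = "42,\"An effective, preventive product\",8";

        // A plain Split knows nothing about quotes, so the description is cut in two.
        string[] data = line.Split(',');

        Console.WriteLine(data.Length); // 4 pieces instead of the 3 real columns
        Console.WriteLine(data[2]);     // second half of the description, not the category id
    }
}
```

Every column after the description shifts one position to the right, which is why the index found from the header row no longer matches.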

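The spreadsheet is data supplied by our client, so I would rather not ask the client to edit the file unless it is necessary. Is there a way to handle this in my code? Is there a different way to read the file that does not split everything on commas?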

I suggest using ADO.NET. If a field's data is enclosed in quotes, it parses it as a single field and ignores any commas inside it.

Code example:

using System.Data;
using System.Data.OleDb;
using System.Globalization;
using System.IO;

static DataTable GetDataTableFromCsv(string path, bool isFirstRowHeader)
{
    string header = isFirstRowHeader ? "Yes" : "No";
    string pathOnly = Path.GetDirectoryName(path);
    string fileName = Path.GetFileName(path);
    // The text driver treats the directory as the data source and the file as a table.
    string sql = @"SELECT * FROM [" + fileName + "]";
    // Note: the Jet 4.0 provider is only available to 32-bit processes.
    using (OleDbConnection connection = new OleDbConnection(
              @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + pathOnly +
              ";Extended Properties=\"Text;HDR=" + header + "\""))
    using (OleDbCommand command = new OleDbCommand(sql, connection))
    using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
    {
        DataTable dataTable = new DataTable();
        dataTable.Locale = CultureInfo.CurrentCulture;
        adapter.Fill(dataTable); // Fill opens and closes the connection itself
        return dataTable;
    }
}
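
The function above needs a file on disk, while the question reads from an uploaded stream. As an alternative (a different technique from the ADO.NET approach, not part of the original answer), here is a minimal sketch using `TextFieldParser` from `Microsoft.VisualBasic.FileIO`, which also honors quoted fields and accepts a `TextReader`. The `CsvSketch` class, the `ReadRows` helper, and the sample data are hypothetical names made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO; // requires a reference to the Microsoft.VisualBasic assembly

static class CsvSketch
{
    // Hypothetical helper: yields one string[] per CSV record,
    // keeping commas that appear inside quoted fields.
    public static IEnumerable<string[]> ReadRows(TextReader reader)
    {
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                yield return parser.ReadFields();
            }
        }
    }

    static void Main()
    {
        // Made-up sample data; the column names are borrowed from the question.
        string csv = "ProductId,Title,PrimaryCategoryId\n" +
                     "42,\"An effective, preventive product\",8\n";

        int categoryIdIndex = -1;
        foreach (string[] fields in ReadRows(new StringReader(csv)))
        {
            if (categoryIdIndex < 0)
            {
                // First record is the header row: locate the column once.
                categoryIdIndex = Array.IndexOf(fields, "PrimaryCategoryId");
                continue;
            }
            Console.WriteLine(fields[categoryIdIndex]); // 8
        }
    }
}
```

In the question's handler, the same helper could presumably be fed `new StreamReader(file.InputStream)` in place of the `StringReader`, so the rest of the loop (header lookup, then indexing by column) stays as it is. Either approach only helps if the client's file actually quotes the cells that contain commas.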
