Getting the most recently modified file from Azure Blob storage



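Suppose I generate a few JSON files in blob storage every day. What I want is to get the most recently modified file across all of my directories. So my blob container would contain something like this: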

2016/01/02/test.json
2016/01/02/test2.json
2016/02/03/test.json

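I want to get 2016/02/03/test.json. One approach would be to take each file's full path and use a regex to find the most recently created directory, but that doesn't work if a directory contains more than one JSON file. Is there something like File.GetLastWriteTime that returns the most recently modified file? By the way, this is the code I'm using to get all the files: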

public static CloudBlobContainer GetBlobContainer(string accountName, string accountKey, string containerName)
{
    CloudStorageAccount storageAccount = new CloudStorageAccount(new StorageCredentials(accountName, accountKey), true);
    // blob client
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
    // container
    CloudBlobContainer blobContainer = blobClient.GetContainerReference(containerName);
    return blobContainer;
}
public static IEnumerable<IListBlobItem> GetBlobItems(CloudBlobContainer container)
{
    IEnumerable<IListBlobItem> items = container.ListBlobs(useFlatBlobListing: true);
    return items;
}
public static List<string> GetAllBlobFiles(IEnumerable<IListBlobItem> blobs)
{
    var listOfFileNames = new List<string>();
    foreach (var blob in blobs)
    {
        var blobFileName = blob.Uri.Segments.Last();
        listOfFileNames.Add(blobFileName);
    }
    return listOfFileNames;
}

Each IListBlobItem will be a CloudBlockBlob, a CloudPageBlob, or a CloudBlobDirectory.

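After casting to a block or page blob, or to their shared base class CloudBlob (preferably with the as keyword and a null check), you can access the modified date via blockBlob.Properties.LastModified.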

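Note that your implementation performs an O(n) scan over all blobs in the container, which can take a while if there are hundreds of thousands of files. There is currently no way to query blob storage more efficiently, though (unless you abuse the file naming and encode the date so that newer dates sort first alphabetically). Realistically, if you need better query performance, I'd recommend keeping a database table on the side that represents the file list as rows, with an indexed, searchable DateModified column and a column holding the blob path for easy access to the file.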
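To make the file-naming trick concrete, here is a minimal sketch that assumes you control the blob names at upload time. The class and method names (BlobNaming, NewestFirstName) are made up for illustration and are not part of any Azure SDK. It relies on blob listings being returned in alphabetical order by name, so a prefix that shrinks as time advances puts the newest blob first.

```csharp
using System;
using System.Globalization;

public static class BlobNaming
{
    // Builds a blob name whose ordinal (alphabetical) order is newest-first.
    // Subtracting from DateTime.MaxValue.Ticks inverts the order, and the
    // fixed 19-digit zero padding keeps string order equal to numeric order.
    public static string NewestFirstName(DateTime modifiedUtc, string fileName)
    {
        long inverted = DateTime.MaxValue.Ticks - modifiedUtc.Ticks;
        return inverted.ToString("D19", CultureInfo.InvariantCulture) + "_" + fileName;
    }
}
```

With names like these, the first item of a listing would be the most recent upload. The name only reflects the time it was generated, so a blob that is overwritten later keeps its old position unless it is re-uploaded under a new name.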

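Update (2022): It seems that Microsoft now offers customizable Blob Index Tags. This should allow you to add a custom DateModified property or similar to the blob metadata and run efficient "greater than" / "less than" queries against your blobs without needing a separate database. (Note: it apparently only supports string values, so for date values you need to make sure to save them in a lexicographically sortable format such as "yyyy-MM-dd".)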
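The point about sortable date strings can be sketched without touching the storage API at all. The helper names below (SortableDates, ToTagValue, IsOnOrAfter) are hypothetical; the sketch only shows why "yyyy-MM-dd" values compare correctly as plain strings.

```csharp
using System;
using System.Globalization;

public static class SortableDates
{
    // Formats a timestamp so that ordinal string comparison matches date order.
    public static string ToTagValue(DateTimeOffset value) =>
        value.UtcDateTime.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);

    // True when tag value a denotes a date on or after tag value b.
    public static bool IsOnOrAfter(string a, string b) =>
        string.CompareOrdinal(a, b) >= 0;
}
```

A format such as "M/d/yyyy" would not work here: as strings, "10/1/2022" sorts before "9/30/2022" because the comparison goes character by character.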

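As Yar said, you can use the LastModified property of an individual blob object. Here is a code snippet that shows how to do that once you have a reference to the correct container: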

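var latestBlob = container.ListBlobs(useFlatBlobListing: true) // flat listing includes blobs inside virtual directories
    .OfType<CloudBlockBlob>()
    .OrderByDescending(m => m.Properties.LastModified)
    .FirstOrDefault(); // null when the container has no block blobs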

Note: the blob type may not be <CloudBlockBlob>. Be sure to change it if necessary.
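Sorting the whole listing is more work than needed when only the newest item matters. As a rough sketch, a single pass that keeps the current maximum does the same job in O(n). BlobInfo below is a hypothetical stand-in for a listed blob (a name plus a nullable LastModified), not an SDK type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a listed blob: just a name and its LastModified value.
public sealed class BlobInfo
{
    public string Name { get; set; }
    public DateTimeOffset? LastModified { get; set; }
}

public static class LatestBlobFinder
{
    // Returns the entry with the greatest LastModified, or null for an empty
    // sequence. Entries without a timestamp are skipped.
    public static BlobInfo FindLatest(IEnumerable<BlobInfo> blobs)
    {
        BlobInfo latest = null;
        foreach (var blob in blobs)
        {
            if (blob.LastModified == null) continue;
            if (latest == null || blob.LastModified > latest.LastModified)
                latest = blob;
        }
        return latest;
    }
}
```

The same loop could be applied to real listing results by reading Properties.LastModified from each blob instead of the stand-in type.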

// Connection string
string storageAccount_connectionString = "**NOTE: CONNECTION STRING**";
// Retrieve storage account from connection string.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(storageAccount_connectionString);
// Create the blob client.
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
// Retrieve reference to a previously created container.
CloudBlobContainer container = blobClient.GetContainerReference("**NOTE:NAME OF CONTAINER**");
// Listing throws if the specified container does not exist
try
{
    // Root directory
    CloudBlobDirectory dira = container.GetDirectoryReference(string.Empty);
    // true lists blobs in all subdirectories, false only the top level.
    // A single call returns only the first segment of results.
    var rootDirFolders = dira.ListBlobsSegmentedAsync(true, BlobListingDetails.Metadata, null, null, null, null).Result;
    foreach (var blob in rootDirFolders.Results)
    {
        if (blob is CloudBlockBlob blockBlob)
        {
            var time = blockBlob.Properties.LastModified;
            // The format string needs placeholders, otherwise the arguments are not printed
            Console.WriteLine("{0}: {1}", blockBlob.Name, time);
        }
    }
}
catch (Exception e)
{
    // Handle errors
    Console.WriteLine("Error: {0}", e);
}

With the new V12 NuGet package, the previous answers are out of date. I used the following guide to help upgrade from version 9 to version 12: https://elcamino.cloud/articles/2020-03-30-azure-storage-blobs-net-sdk-v12-upgrade-guide-and-tips.html

The new NuGet package is Azure.Storage.Blobs; I used version 12.8.4.

The following code gets the last modified date of a blob. You could also write an async version of this code.

using System;
using Azure.Storage.Blobs;

DateTimeOffset? GetLastModified()
{
    BlobServiceClient blobServiceClient = new BlobServiceClient("connectionstring");
    // "blobname" is the container name here
    BlobContainerClient blobContainerClient = blobServiceClient.GetBlobContainerClient("blobname");
    BlobClient blobClient = blobContainerClient.GetBlobClient("file.txt");
    // GetBlobClient never returns null, so only existence needs checking
    if (!blobClient.Exists()) return null;
    DateTimeOffset lastModified = blobClient.GetProperties().Value.LastModified;
    return lastModified;
}

Use the Azure WebJobs SDK. The SDK has options for monitoring new or updated blobs.

If that causes problems, use blockBlob.Container.Properties.LastModified (note that this is the container's timestamp, not the individual blob's).

With Microsoft.Azure.Storage.Blob, you can get it as follows:

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Blob;
namespace ListLastModificationOnBlob
{
    class Program
    {
        static void Main(string[] args)
        {
            MainAsync().Wait();
        }
        static async Task MainAsync()
        {
            string storageAccount_connectionString = @"Your connection string";
            // Retrieve storage account from connection string.
            CloudStorageAccount storageAccount = CloudStorageAccount.Parse(storageAccount_connectionString);
            // Create the blob client.
            CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
            var containers = await ListContainersAsync(blobClient);
            foreach (var container in containers)
            {
                Console.WriteLine(container.Name);
                try
                {
                    //root directory
                    CloudBlobDirectory dira = container.GetDirectoryReference(string.Empty);
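                    // true lists blobs in all subdirectories, false only the top level.
                    // Only the first segment of results is returned; pass the continuation token to page further.
                    var rootDirFolders = await dira.ListBlobsSegmentedAsync(true, BlobListingDetails.Metadata, null, null, null, null);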
                    using (var w = new StreamWriter($"{container.Name}.csv"))
                    {
                        foreach (var blob in rootDirFolders.Results)
                        {
                            if (blob is CloudBlob blockBlob)
                            {
                                var time = blockBlob.Properties.LastModified;
                                var created = blockBlob.Properties.Created;
                                var line = $"{blockBlob.Name},{created},{time}";
                                await w.WriteLineAsync(line);
                                await w.FlushAsync();
                            }
                        }
                    }
                }
                catch (Exception e)
                {
                    // Handle errors; the format string needs a placeholder to print the exception
                    Console.WriteLine("Error: {0}", e);
                }
            }
        }
        private static async Task<IEnumerable<CloudBlobContainer>> ListContainersAsync(CloudBlobClient cloudBlobClient)
        {
            BlobContinuationToken continuationToken = null;
            var containers = new List<CloudBlobContainer>();
            do
            {
                ContainerResultSegment response = await cloudBlobClient.ListContainersSegmentedAsync(continuationToken);
                continuationToken = response.ContinuationToken;
                containers.AddRange(response.Results);
            } while (continuationToken != null);
            return containers;
        }
    }
}

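For a given storage account, the code above:

  • gets all containers in the account
  • takes all blobs in each container
  • saves Created and LastModified, along with the blob name, to a CSV file (named after the container)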

Using rollsch's and hbd's approach, I was able to get the latest image like this:

public string File;
public async Task OnGetAsync()
{
    var gettingLastModified = _blobServiceClient
        .GetBlobContainerClient("images")
        .GetBlobs()
        .OrderByDescending(m => m.Properties.LastModified)
        .First();
    LatestImageFromAzure = gettingLastModified.Name;
    File = await _blobService.GetBlob(LatestImageFromAzure, "images");
}

I also used the approach from https://www.youtube.com/watch?v=B_yDG35lb5I&t=1864s
