How to synchronize file creation in a cache (not writing to the file, just creating it)



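I have a repository where old files are stored, like an archive.
Users fetch those files through a simple web application.
I maintain a simple file system cache on the server where my web app is running.
At least it looked simple while it was just an idea :)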

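I need to synchronize file creation in that cache so that only one thread at a time is allowed to fetch the same file from the archive.

All other threads that need that file must wait until the first thread finishes writing it to the cache, and then fetch it from there.
At first I used the File.exists() method, but that is no good, because it returns true immediately after a thread (the lock owner) creates an empty file (so that it can start writing to it from the repository stream).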
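To make the problem concrete, here is a tiny sketch of the situation. The class name `ExistsTooEarlyDemo` and the file name are made up for illustration; the point is only that `exists()` already reports true for a file that has been created but not yet written:

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical demo class; all names here are illustrative only.
class ExistsTooEarlyDemo {

    // Creates an empty file the way the lock owner does, then reports
    // what a second thread would observe at that moment.
    // Returns { exists, isEmpty }.
    static boolean[] observeFreshFile() {
        try {
            File dir = Files.createTempDirectory("cache-demo").toFile();
            File f = new File(dir, "archived.tif");
            f.createNewFile();               // empty placeholder, nothing written yet
            boolean exists = f.exists();     // already true at this point
            boolean empty = f.length() == 0; // but there is no content yet
            f.delete();
            dir.delete();
            return new boolean[] { exists, empty };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = observeFreshFile();
        System.out.println("exists=" + r[0] + ", empty=" + r[1]);
    }
}
```

So a reader that relies on `exists()` alone cannot tell a complete file from one that is still being filled.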

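I'm not sure this is the right approach, but I'm using a static Map (which maps file_ID to syncDummyObject) to track which files are currently being fetched.
Then I (try to) synchronize the file fetching on that syncDummyObject.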

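Is this the right way to do it? The code is working, but before I put it into production I need to make sure it is sound.

I considered using a staging directory, where I would create the files and move them into the cache once they are complete, but that would open another set of problems...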
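For illustration, the staging idea could look roughly like the sketch below. It is not taken from the actual code: `StagedCacheWriter`, `store`, and the parameters are made-up names, and it assumes the staging and cache directories live on the same file system so that an atomic move is possible.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; names and layout are assumptions for this sketch.
class StagedCacheWriter {

    // Copies the repository stream into a temp file inside stagingDir, then
    // moves it into the cache in one step, so readers of cacheFile never see
    // a half-written file.
    static Path store(InputStream repoStream, Path stagingDir, Path cacheFile)
            throws IOException {
        Path tmp = Files.createTempFile(stagingDir, "fetch-", ".part");
        try {
            // The temp file already exists, hence REPLACE_EXISTING.
            Files.copy(repoStream, tmp, StandardCopyOption.REPLACE_EXISTING);
            // ATOMIC_MOVE throws AtomicMoveNotSupportedException if the file
            // system cannot do it (for example across different volumes).
            return Files.move(tmp, cacheFile, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }
}
```

This only solves the "incomplete file is visible" part. Two threads could still both download the same file, so the per-file synchronization question remains.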

I removed logging and non-relevant parts of the error handling for better readability.

Thanks!

public class RepoFileFetcher{
    private static volatile ConcurrentHashMap<String, Object> syncStrings = new ConcurrentHashMap<String, Object>();    
    private static final Object mapSync = new Object(); // map access sync
    private Boolean isFileBeingCreated = new Boolean(false);
    private Boolean isFileReadyInCache = new Boolean(false);

    public File getFileById(MikFileIdentifier cxfi){        
        File theFile = null; // file I'm going to return in the end
        try{
            Object syncObject = null;
            // sync map access
            synchronized(mapSync){
                if(syncStrings.containsKey(cxfi.getFilePath())){
                    // if the key exists in the map it means that
                    // it's being created by another thread
                    // fetch the object from the map 
                    // and use it to wait until file is created in cache
                    syncObject = syncStrings.get(cxfi.getFilePath());
                    isFileBeingCreated = true;

                }else if(!(new File(cxfi.getFilePath())).exists()){
                    // if it doesn't exist in map nor in cache it means that
                    // I'm the first one that fetches it from repo
                    // create new dummyLockObject and put it in the map
                    syncObject = new Object();
                    syncStrings.put(cxfi.getFilePath(), syncObject);
                }else{
                    // if it's not being created and exists in cache
                    // set flag so I can fetch it from the cache below
                    isFileReadyInCache = true;
                }
            }

            // potential problem that I'm splitting the critical section in half,
            // but I don't know how to avoid locking the whole fetching process
            // I want to lock only on the file that's being fetched, not fetching of all files (which I'd get if the mapSync was still locked)
            // What if, at this very moment, some other thread starts fetching the file and isFileBeingCreated becomes stale? Is it enough to check whether I succeeded renaming it and if not then fetch from cache? 

            if(!isFileBeingCreated && !isFileReadyInCache){
                // skip fetching from repo if another thread is currently fetching it
                // sync only on that file's map object
                synchronized(syncObject){
                    File pFile = new File(cxfi.getFilePath());
                    pFile.createNewFile();
                    // ...
                    // ... the part where I write to pFile from repo stream
                    // ...
                    if(!pFile.renameTo(theFile)){
                        // file is created by someone else 
                        // fetch it from cache
                        theFile = fetchFromCache(cxfi, syncObject);
                    }
                    syncStrings.remove(cxfi.getFilePath());
                    // notify all threads in queue that the file creation is over
                    syncObject.notifyAll();
                }//sync
            }else{
                theFile = fetchFromCache(cxfi, syncObject);
            }
            return theFile;

        }catch(...{
            // removed for better readability
        }finally{
            // remove from the map, otherwise I'll lock that file indefinitely
            syncStrings.remove(cxfi.getFilePath());
        }
        return null;
    }

    /**
     * Fetches the file from cache
     * @param cxfi File identification object
     * @param syncObject Used to obtain lock on file
     * @return File from cache
     * @throws MikFileSynchronizationException
     * @author mbonaci
     */
    private File fetchFromCache(FileIdentifier cxfi, Object syncObject)
            throws MikFileSynchronizationException{
        try{
            // wait till lock owner finishes creating the file
            // then fetch it from the cache
            synchronized(syncObject){   
                // wait until lock owner removes dummyObject from the map
                // while(syncStrings.containsKey(cxfi.getFilePath()))
                // syncObject.wait();                   
                File existingFile = new File(cxfi.getFilePath());
                if(existingFile.exists()){
                    return existingFile;
                }else{
                    // this should never happen
                    throw new MikFileSynchronizationException();
                }
            }
        }catch(InterruptedException ie){
            logger.error("Synchronization error", ie);
        }
        return null;
    }
}

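EDIT I: Thank you all for your help. The suggestion about using putIfAbsent() on the CHM was the key. I ended up doing it like this (any additional comments are welcome):

EDIT II: Added CHM element removal in the other if branches of the class (because now I put the elem in the map even when I don't need to).

EDIT III: Moved the file existence check up, into the variable isFileInCache.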

public class RepoFileFetcher{               
    private static volatile ConcurrentHashMap<String, Object> syncStrings = new ConcurrentHashMap<String, Object>();    
    // save some time so I can lock syncObject earlier 
    private boolean isFileInCache = false;
    // remember whether we put the elem in the map or not 
    // so we know whether to remove it later 
    private boolean insertedMapElem = false; // added in EDIT II 
    /**
     * Fetches the file from repository (and caches it) or directly from cache if available
     * @param cxfi File identification object
     * @return File
     * @author mbonaci
     */
    public File getFileById(FileIdentifier cxfi){
        String fileId = cxfi.getFileId();
        String fileName = cxfi.getOnlyPath() + fileId;
        File theFile = null; // file I'm going to return in the end
        try{
            Object syncObject = null;
            Object dummyObject = new Object();                  
            isFileInCache = (new File(fileName)).exists();
            syncObject = syncStrings.putIfAbsent(fileId, dummyObject);
            if(syncObject == null){ // wasn't in the map                            
                insertedMapElem = true; // we put the new object in
                if(!isFileInCache){ // not in cache
                    // if it doesn't exist in map nor in cache it means that
                    // I'm the first one that fetches it from repo (or cache was deleted)
                    // syncObject = new lock object I placed in the map
                    syncObject = dummyObject;
                    synchronized(syncObject){
                        File pFile = new File(cxfi.getFilePath());
                        pFile.createNewFile();
                        // ...
                        // ... the part where I write to pFile from repo stream
                        // ...
                        pFile.renameTo(theFile);
                        theFile = pFile;
                        syncStrings.remove(fileId);
                        // notify all threads in queue that the file is now ready to be fetched from cache
                        syncObject.notifyAll();
                    }//sync
                }else{
                    // if it's not being created and exists in cache it means that it's complete
                    // fetch it from cache without blocking (only reading)
                    syncStrings.remove(fileId); // added in EDIT II
                    theFile = new File(fileName);
                }
            }else{
                // if the key exists in the map it means that
                // it's being created by another thread
                // fetch the object from the map 
                // and use it to wait until file is created in cache
                // don't touch the map (I haven't added anything)
                // the lock owner will remove the elem
                // syncObject = the object that what was in the map when I called putIfAbsent()
                theFile = fetchFromCache(cxfi, syncObject);
            }
            return theFile;
        }catch(...{
            // removed for better readability
        }finally{ 
            // no good cuz' this way I'd sometimes remove the map elem
            // while lock owner still writes to a file
            // only the one who placed the elem in the map should remove it
            // remove from the map, otherwise I'll lock that file indefinitely
            // syncStrings.remove(fileId); // commented out in EDIT II
        }
        // remove in case of exception (but only if we added it)
        if(insertedMapElem)
            syncStrings.remove(fileId);
        return null;
    }

    /**
     * Fetches the file from cache after it obtains lock on <code>syncObject</code>
     * @param cxfi File identification object
     * @param syncObject Used to obtain lock on file
     * @return File from cache
     * @author mbonaci
     */
    private File fetchFromCache(FileIdentifier cxfi, Object syncObject){
        String fileId = cxfi.getFileId();
        String fileName =  fileId + ".tif";
        synchronized(syncObject){
            File existingFile = new File(cxfi.getAbsPath() + fileName);
            if(existingFile.exists()){
                return existingFile;
            }
        }
        return null; // file was not found in the cache
    }
}

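May I suggest a couple of tweaks:

  1. You are already using a ConcurrentHashMap, so there is no need for an additional lock.
  2. I would wrap the "file" in a smarter object that has its own synchronization. So you can do something like:

    • Call putIfAbsent() on the map with the path and the "smart" object that wraps the file.
    • putIfAbsent() returns the existing wrapper if the path was already mapped, or null if your new wrapper was inserted (in which case you use the new one).
  3. Keep state in the wrapper that knows whether the file has already been cached.
  4. Call cache(), which checks whether the file has been cached; if so it does nothing, otherwise it caches it.
  5. Then return the "file" from the wrapper (say, via a getFile() method).

Then make sure you use a lock inside the wrapper for the public functions, so that a caller blocks while cache() is running concurrently.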

Here is a sketch:

import java.io.File;

class CachedFile
{
  private final File realFile;
  // Initially not cached
  private boolean cached = false;
  // Construct with the path of the file in the cache
  public CachedFile(String path)
  { this.realFile = new File(path); }
  public synchronized boolean isCached()
  { return cached; }
  public synchronized void cache()
  {
    if (!cached)
    {
      // now load - safe in the knowledge that no one can get the file (or cache())
      // ...
      cached = true; // done
    }
  }
  public synchronized File getFile()
  {
    // return the "file"
    return realFile;
  }
}

Now your code becomes:

ConcurrentHashMap<String, CachedFile> myCache = new ConcurrentHashMap<>();
CachedFile newFile = new CachedFile(<path>);
CachedFile file = myCache.putIfAbsent(<path>, newFile);
// Use the new file if it did not exist
if (file == null) file = newFile;
// This will be a no-op if already cached, or will block if someone is caching this file.
file.cache();
// Now return the cached file.
return file.getFile();
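
To check the idea, a self-contained sketch along the following lines should show a single repository hit for many concurrent requests. The names `CountingCachedFile`, `WrapperCacheDemo`, and the path are made up for this example, and the actual fetch is only simulated by a counter:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the wrapper; the repository fetch is simulated.
class CountingCachedFile {
    static final AtomicInteger fetches = new AtomicInteger(); // counts repo hits
    private final String path;
    private boolean cached = false;

    CountingCachedFile(String path) { this.path = path; }

    synchronized void cache() {
        if (!cached) {
            fetches.incrementAndGet(); // pretend to copy from the repository
            cached = true;
        }
    }

    synchronized String getFile() { return path; }
}

class WrapperCacheDemo {
    static final ConcurrentHashMap<String, CountingCachedFile> cache =
            new ConcurrentHashMap<>();

    static String get(String path) {
        CountingCachedFile fresh = new CountingCachedFile(path);
        CountingCachedFile file = cache.putIfAbsent(path, fresh);
        if (file == null) file = fresh; // we inserted, so use our wrapper
        file.cache();                   // blocks while another thread is caching
        return file.getFile();
    }

    // Starts `threads` concurrent requests for one path and returns how many
    // times the simulated repository was actually hit.
    static int runConcurrently(int threads) throws InterruptedException {
        CountingCachedFile.fetches.set(0);
        cache.clear();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await(); // release all workers at once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                get("/cache/archived.tif");
            });
        }
        start.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return CountingCachedFile.fetches.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("repository fetches: " + runConcurrently(8));
    }
}
```

Only the wrapper that actually landed in the map ever has cache() called on it, and that method is synchronized, so the counter is incremented once no matter how many threads ask for the same path.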

Does this make sense?
