As it turns out, the Asynchronous File Iterator I wrote about way back in June 2013 has something of an issue, the big one being that it doesn’t actually work properly! Serves me right for not testing it on real data. I discovered this when I returned to it to update it to use C# 6.0 features and found that it no longer worked as is (if it ever did).
I also noticed that it was, to be honest, poorly designed. I think I may have rushed through it to get it into a blog post without fully considering the design decisions. This became apparent as I watched the search results trickle in while more than 4 threads were running. The problem was that every directory search ran on a new thread, including, of course, the subdirectory searches started by other subdirectory searches.
I decided to return to the drawing board and try to redesign it in a more sane manner. Since this is a fairly straightforward program/implementation it seemed easier for me to do it this way than figure out how to fix the original, particularly since the original appeared to have been rushed to meet some imagined posting deadline.
The main purpose of the asynchronous search is to try to increase performance over a fully synchronous search. With most drives, reading directory information from multiple locations at once won’t significantly impact performance, particularly with an SSD. For the moment I’ve decided on two threads maximum. Due to the issues with the original, I’ve also decided on a limitation: threads will only be started for the directories in the initial search directory, and any search in a directory lower down (higher up?) the directory tree will instead be performed synchronously. Naturally, our friend FindFirstFile() and co. appear. My implementation doesn’t pass the mask to FindFirstFile; it always searches for "*", so that a single directory traversal grabs directories as well as files, and the files are then filtered against the provided mask using the Like functionality I described in a previous blog post. (Interestingly, I wrote about it twice: in the linked post, as well as here. It’s pretty obvious you’ve written a lot of content when you find you wrote about things twice without realizing it.)
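To make the mask filtering concrete, here is a minimal sketch of one way to match a file name against a mask once the traversal has returned everything. It is not the Like extension from the earlier post; `WildcardMask` and its methods are hypothetical names, and the approach (translating the mask to an anchored regular expression) is just one option.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not the Like extension from the earlier post.
public static class WildcardMask
{
    // Translates a file mask into an anchored, case-insensitive regular expression:
    // '*' matches any run of characters, '?' matches exactly one character.
    public static Regex ToRegex(string mask)
    {
        // Regex.Escape turns '*' into "\*" and '?' into "\?", which we then swap
        // for their regex equivalents. Everything else stays literal.
        string pattern = "^" + Regex.Escape(mask)
            .Replace("\\*", ".*")
            .Replace("\\?", ".") + "$";
        return new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }

    // Returns true when the whole file name fits the mask.
    public static bool Matches(string fileName, string mask)
    {
        return ToRegex(mask).IsMatch(fileName);
    }
}
```

Note that this sketch treats the mask literally, so "*.*" only matches names that contain a dot, which differs from how the Windows shell traditionally interprets that mask.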
In my mind, the idea behind the class is to be a “lower level” class that will be consumed by a more advanced Search class, with the advanced class supporting things like Advanced Filters and perhaps Actions that can be taken on each result. The more “low-level” class would expose two events: one fired when an item is found, and one fired when a search completes.
The “old” class I discussed in the previous post used a design that managed state via fields in the class. In my redesign I’ll “cheat” and make sure the actual search function itself is synchronous, and provide asynchronous behaviour via a wrapper. I should mention that I actually use the ‘Thread’ model, rather than an async routine. I admit this is because I’ve yet to fully grok async. The fact that my work requires .NET Framework 4 and thus we cannot use the async stuff (meaning I don’t get any experience with it through my work) doesn’t help. On the other hand- this does mean this code will work in older versions of C#. At any rate, since we will be dealing with events, I suppose we ought to define them first:
```csharp
//fired when a FileFind operation completes.
public class AsyncFileFindCompleteEventArgs : EventArgs
{
    public enum CompletionCauseEnum
    {
        Complete_Success,
        Complete_Cancelled
    }

    private CompletionCauseEnum _completionCause = CompletionCauseEnum.Complete_Success;

    public CompletionCauseEnum CompletionCause
    {
        get { return _completionCause; }
        set { _completionCause = value; }
    }

    public AsyncFileFindCompleteEventArgs(CompletionCauseEnum CompletionType)
    {
        _completionCause = CompletionType;
    }
}

public class AsyncFileFoundEventArgs : EventArgs
{
    private FileSearchResult _Result = null;

    public FileSearchResult Result
    {
        get { return _Result; }
    }

    public AsyncFileFoundEventArgs(FileSearchResult result)
    {
        _Result = result;
    }
}
```
The Completion event contains a property to indicate “how” the search completed. Alternatively, it could use distinct events, or even an event hierarchy (e.g. AsyncFileFindCompletion being an abstract base class with an “AsyncFileFindCancelled” subclass and an “AsyncFileFindFinished” subclass). Cancelling is one consideration that is pretty important: we want to be able to cancel the search at any time by calling a Cancel method. Calling the Cancel method should guarantee that, after you call it, you will not receive any more file-found events from that search.
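A plain "cancelled" flag checked before raising an event still leaves a small window between the check and the handler call. As a rough sketch of one way to close that window, the flag and the event invocation can share a lock, so that once Cancel returns no further event can be raised. `CancellableNotifierSketch`, `ItemFound` and `TryRaiseItemFound` are hypothetical names for illustration only.

```csharp
using System;

// Hypothetical illustration; not part of the AsyncFileFinder class.
public class CancellableNotifierSketch
{
    private readonly object _gate = new object();
    private bool _cancelled;

    public event EventHandler<EventArgs> ItemFound;

    // After this returns, TryRaiseItemFound can no longer invoke handlers,
    // because both methods take the same lock.
    public void Cancel()
    {
        lock (_gate)
        {
            _cancelled = true;
        }
    }

    // Raises ItemFound unless the notifier was cancelled.
    // Returns true if the cancelled check passed, false otherwise.
    public bool TryRaiseItemFound()
    {
        lock (_gate)
        {
            if (_cancelled) return false;
            EventHandler<EventArgs> handler = ItemFound;
            if (handler != null) handler(this, EventArgs.Empty);
            return true;
        }
    }
}
```

The trade-off is that handlers run while the lock is held. A handler that calls Cancel on the same thread is fine, since C# locks are re-entrant, but a handler that blocks waiting on another thread which is itself calling Cancel would deadlock.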
Similarly, since this will be asynchronous and the caller will be given control right away, we’ll want to have some sort of re-entrancy protection. The Start method will detect if a search is already in progress and, if so, raise this exception:
```csharp
using System;
using System.Runtime.Serialization;

public class SearchAlreadyInProgressException : Exception
{
    //serialization constructor.
    public SearchAlreadyInProgressException(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
    }

    public SearchAlreadyInProgressException()
        : base("AsyncFileFinder Search Already started")
    {
    }
}
```
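If two threads call Start at nearly the same time, a simple "check a bool, then set it" guard can let both through. A minimal sketch of an atomic version of the re-entrancy check follows, using Interlocked.CompareExchange. `StartGuardSketch` and `MarkFinished` are hypothetical names, and the block throws InvalidOperationException only so that it stays self-contained.

```csharp
using System;
using System.Threading;

// Hypothetical illustration of an atomic "already started" guard.
public class StartGuardSketch
{
    // 0 = idle, 1 = running. An int is used because Interlocked has no bool overloads.
    private int _running;

    public bool IsRunning
    {
        // CompareExchange with identical value and comparand is an atomic read.
        get { return Interlocked.CompareExchange(ref _running, 0, 0) == 1; }
    }

    public void Start()
    {
        // Atomically flip 0 -> 1. If the previous value was not 0,
        // another caller got here first and a search is already in progress.
        if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
            throw new InvalidOperationException("Search already started");

        // ... the worker thread would be launched here ...
    }

    // Called by the worker when it finishes, so Start can be used again.
    public void MarkFinished()
    {
        Interlocked.Exchange(ref _running, 0);
    }
}
```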
And, the main class itself:
```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading;

//NativeMethods, FileSearchResult, the Like() extension and the Debug class come from the rest of the project.
public class AsyncFileFinder
{
    public delegate bool FilterDelegate(FileSearchResult fsearch);

    //FindFirstFile reports failure as INVALID_HANDLE_VALUE (-1), not as a null handle.
    private static readonly IntPtr InvalidHandleValue = new IntPtr(-1);

    //volatile: these flags are written on one thread and read on another.
    private volatile bool _Cancelled = false;
    private volatile bool _IsSearching = false;
    private bool isChild = false;
    private String _SearchDirectory = "";
    private String _SearchMask = "*";
    private FilterDelegate FileFilter = null;
    private FilterDelegate DirectoryRecursionFilter = null;
    private Thread SearchThread = null;
    //nothing in this version enqueues into FoundElements; results are delivered through AsyncFileFound.
    private ConcurrentQueue<FileSearchResult> FoundElements = new ConcurrentQueue<FileSearchResult>();
    //readonly because it doubles as the lock object guarding itself.
    private readonly List<AsyncFileFinder> ChildDirectorySearchers = new List<AsyncFileFinder>();
    private int MaxChildren = 2;

    public event EventHandler<AsyncFileFindCompleteEventArgs> AsyncFileFindComplete;
    public event EventHandler<AsyncFileFoundEventArgs> AsyncFileFound;

    public String SearchDirectory
    {
        get { return _SearchDirectory; }
        set { _SearchDirectory = value; }
    }

    public String SearchMask
    {
        get { return _SearchMask; }
        set { _SearchMask = value; }
    }

    public bool IsSearching
    {
        get { return _IsSearching; }
    }

    public AsyncFileFinder(String pSearchDirectory, String pSearchMask, FilterDelegate pFileFilter = null,
        FilterDelegate pDirectoryRecursionFilter = null, bool pIsChild = false)
    {
        _SearchDirectory = pSearchDirectory;
        _SearchMask = pSearchMask;
        FileFilter = pFileFilter;
        DirectoryRecursionFilter = pDirectoryRecursionFilter;
        isChild = pIsChild;
    }

    public void Cancel()
    {
        //set the flag first so the search loop stops starting new children.
        _Cancelled = true;
        lock (ChildDirectorySearchers)
        {
            foreach (var iterate in ChildDirectorySearchers)
            {
                iterate.Cancel();
            }
        }
        _IsSearching = false;
    }

    private void FireAsyncFileFound(AsyncFileFoundEventArgs e)
    {
        //serialize Found events so handlers never run concurrently.
        lock (this)
        {
            var copied = AsyncFileFound;
            if (copied != null) copied(this, e);
        }
    }

    private void FireAsyncFileFindComplete(AsyncFileFindCompleteEventArgs.CompletionCauseEnum completionCauseEnum)
    {
        var copied = AsyncFileFindComplete;
        if (copied != null) copied(this, new AsyncFileFindCompleteEventArgs(completionCauseEnum));
    }

    public void FireAsyncFileFindComplete()
    {
        FireAsyncFileFindComplete(AsyncFileFindCompleteEventArgs.CompletionCauseEnum.Complete_Success);
    }

    public bool HasResults()
    {
        return !FoundElements.IsEmpty;
    }

    public FileSearchResult GetNextResult()
    {
        //busy-waits until an item can be dequeued.
        FileSearchResult ResultItem = null;
        while (!FoundElements.TryDequeue(out ResultItem))
        {
        }
        return ResultItem;
    }

    public void Start()
    {
        if (IsSearching) throw new SearchAlreadyInProgressException();
        //alright, we want to do this asynchronously, so run StartSync on another thread.
        _IsSearching = true;
        SearchThread = new Thread(StartSync);
        SearchThread.Start();
    }

    private bool FitsMask(string fileName, string fileMask)
    {
        return fileName.Like(fileMask);
    }

    //reads the child count under the same lock the completion handlers use.
    private int ActiveChildCount()
    {
        lock (ChildDirectorySearchers)
        {
            return ChildDirectorySearchers.Count;
        }
    }

    public void StartSync()
    {
        //let's default our Filters if they aren't provided.
        //done here rather than in Start so a direct StartSync call is covered too.
        if (FileFilter == null) FileFilter = ((s) => true);
        if (DirectoryRecursionFilter == null) DirectoryRecursionFilter = ((s) => true);
        lock (ChildDirectorySearchers)
        {
            ChildDirectorySearchers.Clear();
        }
        Debug.Print("StartSync Called, Searching in " + _SearchDirectory + " For Mask " + _SearchMask);
        String sSearch = Path.Combine(_SearchDirectory, "*");
        Queue<String> Directories = new Queue<string>();

        //Task:
        //First, Search our folder for matching files and raise a Found event for each.
        Debug.Print("Searching for files in folder");
        NativeMethods.WIN32_FIND_DATA FindData;
        IntPtr fHandle = NativeMethods.FindFirstFile(sSearch, out FindData);
        //a failed FindFirstFile means there is nothing to enumerate.
        if (fHandle == InvalidHandleValue) fHandle = IntPtr.Zero;
        while (fHandle != IntPtr.Zero)
        {
            if (_Cancelled)
            {
                //don't leak the find handle when bailing out early.
                NativeMethods.FindClose(fHandle);
                _IsSearching = false;
                FireAsyncFileFindComplete(AsyncFileFindCompleteEventArgs.CompletionCauseEnum.Complete_Cancelled);
                return;
            }
            //if the result is a Directory, add it to the list of result directories if it passes the recursion test.
            if ((FindData.dwFileAttributes & FileAttributes.Directory) == FileAttributes.Directory)
            {
                if (FindData.Filename != "." && FindData.Filename != "..")
                {
                    //build the full path from the directory itself, not from the "*" search pattern.
                    if (DirectoryRecursionFilter(new FileSearchResult(FindData, Path.Combine(_SearchDirectory, FindData.Filename))))
                    {
                        Debug.Print("Found Directory:" + FindData.Filename + " Adding to Directory Queue.");
                        Directories.Enqueue(FindData.Filename);
                    }
                }
            }
            else if (FindData.Filename.Length > 0)
            {
                //make sure it matches the given mask.
                if (FitsMask(FindData.Filename, _SearchMask))
                {
                    FileSearchResult fsr = new FileSearchResult(FindData, Path.Combine(_SearchDirectory, FindData.Filename));
                    if (FileFilter(fsr) && !_Cancelled)
                    {
                        Debug.Print("Found File " + fsr.FullPath + " Raising Found event.");
                        FireAsyncFileFound(new AsyncFileFoundEventArgs(fsr));
                    }
                }
            }
            FindData = new NativeMethods.WIN32_FIND_DATA();
            if (!NativeMethods.FindNextFile(fHandle, out FindData))
            {
                Debug.Print("FindNextFile returned False, closing handle...");
                NativeMethods.FindClose(fHandle);
                fHandle = IntPtr.Zero;
            }
        }

        //find all directories in the search folder which also satisfy the Recursion test.
        //Construct a new AsyncFileFinder to search within that folder with the same Mask and delegates for each one.
        //Allow MaxChildren to run at once. When a running filefinder raises its complete event, remove it from the List, and start up one of the ones that have not been run.
        //if isChild is true, we won't actually multithread this task at all.
        Debug.Print("File Search completed. Starting search of " + Directories.Count + " directories found in folder " + _SearchDirectory);
        while (Directories.Count > 0 || ActiveChildCount() > 0)
        {
            if (_Cancelled)
            {
                break;
            }
            //wait for a free slot before starting another child searcher.
            while (ActiveChildCount() >= MaxChildren) Thread.Sleep(5);
            if (Directories.Count == 0)
            {
                Debug.Print("No directories left. Waiting for Child Search instances to complete.");
                Thread.Sleep(5);
                continue;
            }
            Debug.Print("There are " + ActiveChildCount() + " Searchers active. Starting more.");
            String startchilddir = Directories.Dequeue();
            startchilddir = Path.Combine(_SearchDirectory, startchilddir);
            AsyncFileFinder ChildSearcher = new AsyncFileFinder(startchilddir, _SearchMask, FileFilter, DirectoryRecursionFilter, true);
            //forward the child's results as our own, unless we've been cancelled.
            ChildSearcher.AsyncFileFound += (senderchild, foundevent) =>
            {
                if (!_Cancelled) FireAsyncFileFound(foundevent);
            };
            ChildSearcher.AsyncFileFindComplete += (ob, ev) =>
            {
                AsyncFileFinder ChildSearch = (AsyncFileFinder) ob;
                lock (ChildDirectorySearchers)
                {
                    Debug.Print("Child Searcher " + ChildSearch.SearchDirectory + " issued a completion event, removing from list.");
                    ChildDirectorySearchers.Remove(ChildSearch);
                }
            };
            //the completion handler removes from this list on another thread, so add under the same lock.
            lock (ChildDirectorySearchers)
            {
                ChildDirectorySearchers.Add(ChildSearcher);
            }
            if (!isChild)
            {
                Debug.Print("Starting sub-search asynchronously");
                ChildSearcher.Start();
            }
            else
            {
                Debug.Print("Starting sub-search synchronously");
                ChildSearcher.StartSync();
            }
        }
        Debug.Print("Exited Main Search Loop: Queue:" + Directories.Count + " Child Searchers:" + ActiveChildCount());
        _IsSearching = false;
        FireAsyncFileFindComplete(_Cancelled
            ? AsyncFileFindCompleteEventArgs.CompletionCauseEnum.Complete_Cancelled
            : AsyncFileFindCompleteEventArgs.CompletionCauseEnum.Complete_Success);
    }
}
```
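The class above enforces its MaxChildren limit by polling the child list with Thread.Sleep(5). As a possible alternative, and purely as a sketch rather than anything the repository does, the same "at most N workers at once" rule could be expressed with a SemaphoreSlim, which blocks until a slot frees up instead of polling. `BoundedWorkers` and `RunAll` are hypothetical names, and the work items are plain Action delegates standing in for child searches.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical illustration of bounding concurrency with a semaphore instead of polling.
public static class BoundedWorkers
{
    // Runs every work item on its own thread, never more than maxConcurrent at a time.
    // Returns the highest number of items that were observed running simultaneously.
    public static int RunAll(IEnumerable<Action> workItems, int maxConcurrent)
    {
        int active = 0, peak = 0;
        object gate = new object();
        var started = new List<Thread>();

        using (var slots = new SemaphoreSlim(maxConcurrent, maxConcurrent))
        {
            foreach (Action work in workItems)
            {
                // Blocks until one of the maxConcurrent slots is free.
                slots.Wait();

                // Copy to a local so each thread captures its own item.
                Action item = work;
                var worker = new Thread(() =>
                {
                    try
                    {
                        lock (gate)
                        {
                            active++;
                            if (active > peak) peak = active;
                        }
                        item();
                    }
                    finally
                    {
                        lock (gate) { active--; }
                        // Hand the slot back so the next item can start.
                        slots.Release();
                    }
                });
                started.Add(worker);
                worker.Start();
            }

            // Wait for everything to finish before the semaphore is disposed.
            foreach (Thread running in started) running.Join();
        }
        return peak;
    }
}
```

In the search class, each work item would correspond to one child directory search, and maxConcurrent to MaxChildren.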
OK so it ended up a bit longer than I originally expected. The Debug statements use the Debug class I discussed Previously. I’ve plonked the whole thing into a new FileIterator repository. This was totally not because I couldn’t figure out how to change the existing AsyncFileIterator repository to accept my changes, that was just a coincidence.
Now that it actually works, I’ll be able to properly revise it to take advantage of C# 5.0 and C# 6.0 features and discuss those improvements to the project in a later post.