A File Iterator that actually works

May 24, 2015 - .NET, C#

As it turns out, the Asynchronous File Iterator I wrote about way back in June 2013 has something of an issue, the big one being that it doesn’t actually work properly! Serves me right for not testing it on real data. I discovered this when I returned to it to try to update it to use C# 6.0 features, and found that it no longer worked as is (if it ever did).

I also noticed that it was, to be honest, poorly designed. I think I may have rushed through it to get it into a blog post without fully considering the design decisions. This became apparent as I watched the search results trickle in while more than 4 threads were running. The problem was that every directory search started a new thread, and that included, of course, the subdirectory searches started by a subdirectory search.

I decided to return to the drawing board and try to redesign it in a more sane manner. Since this is a fairly straightforward program/implementation it seemed easier for me to do it this way than figure out how to fix the original, particularly since the original appeared to have been rushed to meet some imagined posting deadline.

The main purpose of the asynchronous search is to try to increase performance over a fully synchronous search. With most hard disks, reading directory information from multiple locations at once won’t significantly hurt performance, and with an SSD even less so. For the moment I’ve decided on two threads maximum. Due to the issues with the original, I’ve decided on a limitation- threads will only be started for the directories in the initial search directory, and any search in a directory lower down (higher up?) the directory tree will instead be performed synchronously. Naturally, our friend FindFirstFile() and co. make an appearance. My implementation doesn’t pass the mask to FindFirstFile; that way a single directory traversal picks up directories as well as files, and the files are then filtered against the provided mask using the Like functionality I described in a previous blog post. (Interestingly, I wrote about it twice- in the linked post, as well as here. It’s pretty obvious you’ve written a lot of content when you find you wrote about things twice without realizing it.)
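To illustrate the single-pass idea, here is a minimal sketch. It is not the code from the repository: it uses the managed `DirectoryInfo.EnumerateFileSystemInfos` instead of P/Invoking FindFirstFile, and `MaskToRegex` is a hypothetical regex-based stand-in for the Like helper. The point is only that one enumeration yields both the subdirectories to search later and the files that match the mask.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class MaskFilterSketch
{
    // Hypothetical stand-in for the Like helper: turns a wildcard mask such
    // as "*.txt" into an anchored, case-insensitive regular expression.
    public static Regex MaskToRegex(string mask)
    {
        string pattern = "^" + Regex.Escape(mask)
            .Replace("\\*", ".*")   // '*' matches any run of characters
            .Replace("\\?", ".")    // '?' matches exactly one character
            + "$";
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }

    // One pass over a directory: every subdirectory is collected so it can
    // be searched later, while files are kept only if they match the mask.
    public static void ScanOnce(string directory, string mask,
        List<string> subdirectories, List<string> matchingFiles)
    {
        Regex filter = MaskToRegex(mask);
        foreach (FileSystemInfo entry in
                 new DirectoryInfo(directory).EnumerateFileSystemInfos())
        {
            if ((entry.Attributes & FileAttributes.Directory) != 0)
                subdirectories.Add(entry.FullName);
            else if (filter.IsMatch(entry.Name))
                matchingFiles.Add(entry.FullName);
        }
    }
}
```

Had the mask been handed to the enumeration itself, directories that don't match it would never be returned, and a second pass would be needed to find them.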

In my mind, the idea behind the class is to be a “lower level” class that will be consumed by a more advanced Search class, with the advanced search class supporting things like Advanced Filters and perhaps Actions that can be taken on each result. The more “low-level” class would expose two events: one fired when a search completes, and one fired when an item is found.

The “old” class I discussed in the previous post used a design that managed state via fields in the class. In my redesign I’ll “cheat” and make sure the actual search function itself is synchronous, and provide asynchronous behaviour via a wrapper. I should mention that I actually use the ‘Thread’ model, rather than an async routine. I admit this is because I’ve yet to fully grok async. The fact that my work requires .NET Framework 4 and thus we cannot use the async stuff (meaning I don’t get any experience with it through my work) doesn’t help. On the other hand- this does mean this code will work in older versions of C#. At any rate, since we will be dealing with events, I suppose we ought to define them first:
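A minimal sketch of what the two event argument classes might look like. The type and member names here (`AsyncFileFoundEventArgs`, `AsyncFileFindCompleteEventArgs`, `AsyncFileFindCompletionCause`) are assumptions for illustration and may not match the ones in the repository. It sticks to syntax that works on older compilers.

```csharp
using System;

// How a search ended. Hypothetical enum, assumed for this sketch.
public enum AsyncFileFindCompletionCause
{
    Finished,
    Cancelled
}

// Raised once for every item the search finds.
public class AsyncFileFoundEventArgs : EventArgs
{
    public string FullPath { get; private set; }

    public AsyncFileFoundEventArgs(string fullPath)
    {
        FullPath = fullPath;
    }
}

// Raised once when the search stops, whatever the reason.
public class AsyncFileFindCompleteEventArgs : EventArgs
{
    public AsyncFileFindCompletionCause Cause { get; private set; }

    public AsyncFileFindCompleteEventArgs(AsyncFileFindCompletionCause cause)
    {
        Cause = cause;
    }
}
```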

The Completion event will contain a property to indicate “how” it completed. Alternatively, it could use distinct events, or even an event hierarchy (e.g. AsyncFileFindCompletion being an abstract base class with an “AsyncFileFindCancelled” subclass and an “AsyncFileFindFinished” subclass). Cancelling is one consideration that is pretty important- we want to be able to cancel the search at any time by calling a Cancel method. Calling the Cancel method should guarantee that, once it returns, you will not receive any more file-found events from that search.

Similarly, since this will be asynchronous and the caller will be given control right away, we’ll want to have some sort of re-entrancy protection. The Start method will detect if a search is already in progress, and, if so, it should raise this exception:
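A minimal sketch of such an exception, together with the kind of guard that would throw it. `SearchInProgressException` and `StartGuardSketch` are hypothetical names, and the `Interlocked` flag is just one way to do the check, not necessarily what the repository does.

```csharp
using System;
using System.Threading;

// Hypothetical exception type, assumed for this sketch.
public class SearchInProgressException : InvalidOperationException
{
    public SearchInProgressException()
        : base("A search is already in progress.") { }
}

public class StartGuardSketch
{
    private int _running; // 0 = idle, 1 = searching

    public bool IsSearching
    {
        // CompareExchange with identical values is an atomic read.
        get { return Interlocked.CompareExchange(ref _running, 0, 0) == 1; }
    }

    public void Start(Action search)
    {
        // Atomically flip idle -> searching. If the flag was already set,
        // another search owns it, so refuse instead of starting a second one.
        if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
            throw new SearchInProgressException();

        var worker = new Thread(() =>
        {
            try { search(); }
            finally { Interlocked.Exchange(ref _running, 0); } // idle again
        });
        worker.IsBackground = true;
        worker.Start();
    }
}
```

Doing the test and the set in one atomic step matters: with a plain `if (running) throw; running = true;` two callers could both pass the check before either one sets the flag.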

And, the main class itself:
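What follows is a heavily simplified sketch of the design described above, not the class from the repository. Assumptions: it uses `DirectoryInfo` rather than FindFirstFile, plain `Action` delegates instead of proper event argument classes, a caller-supplied `filter` in place of the mask and Like helper, and two workers that pull the top-level directories from a queue as my reading of "two threads maximum". `FileIteratorSketch` and all of its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

public class FileIteratorSketch
{
    private const int MaxThreads = 2;
    private volatile bool _cancelled;
    private readonly object _eventLock = new object();

    public event Action<string> FileFound;
    public event Action<bool> SearchComplete; // argument: true if cancelled

    public void Cancel()
    {
        // Taking the lock waits for any handler that is mid-call, so no
        // FileFound event is raised once Cancel has returned.
        lock (_eventLock) { _cancelled = true; }
    }

    // Synchronous search; an asynchronous wrapper would run it on a thread.
    public void Search(string root, Func<string, bool> filter)
    {
        _cancelled = false;
        var topLevel = new Queue<string>();
        Scan(root, filter, topLevel.Enqueue); // files in root, queue its dirs

        var workers = new List<Thread>();
        for (int i = 0; i < MaxThreads; i++)
        {
            var worker = new Thread(() =>
            {
                while (!_cancelled)
                {
                    string dir;
                    lock (topLevel)
                    {
                        if (topLevel.Count == 0) return;
                        dir = topLevel.Dequeue();
                    }
                    SearchTree(dir, filter);
                }
            });
            worker.Start();
            workers.Add(worker);
        }
        foreach (Thread worker in workers) worker.Join();

        Action<bool> done = SearchComplete;
        if (done != null) done(_cancelled);
    }

    // Everything below a top-level directory is walked on the worker's own
    // thread; no further threads are started.
    private void SearchTree(string directory, Func<string, bool> filter)
    {
        var pending = new Stack<string>();
        pending.Push(directory);
        while (pending.Count > 0 && !_cancelled)
            Scan(pending.Pop(), filter, pending.Push);
    }

    private void Scan(string directory, Func<string, bool> filter,
        Action<string> onDirectory)
    {
        FileSystemInfo[] entries;
        try { entries = new DirectoryInfo(directory).GetFileSystemInfos(); }
        catch (UnauthorizedAccessException) { return; } // skip unreadable dirs
        catch (IOException) { return; }

        foreach (FileSystemInfo entry in entries)
        {
            if (_cancelled) return;
            bool isDirectory = (entry.Attributes & FileAttributes.Directory) != 0;
            bool isLink = (entry.Attributes & FileAttributes.ReparsePoint) != 0;
            if (isDirectory)
            {
                if (!isLink) onDirectory(entry.FullName); // avoid junction loops
            }
            else if (filter(entry.Name))
            {
                RaiseFileFound(entry.FullName);
            }
        }
    }

    private void RaiseFileFound(string path)
    {
        lock (_eventLock)
        {
            if (_cancelled) return;
            Action<string> handler = FileFound;
            if (handler != null) handler(path);
        }
    }
}
```

One trade-off in this sketch: raising FileFound under a lock is what backs the Cancel guarantee, but it also serializes the handlers, and a handler that blocks on the thread calling Cancel would deadlock.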

OK, so it ended up a bit longer than I originally expected. The Debug statements use the Debug class I discussed previously. I’ve plonked the whole thing into a new FileIterator repository. This was totally not because I couldn’t figure out how to change the existing AsyncFileIterator repository to accept my changes; that was just a coincidence.

Now that it actually works, I’ll be able to properly revise it to take advantage of C# 5.0 and C# 6.0 features and discuss those improvements to the project in a later post.
