Concurrently enumerating an array using blocks in a thread-safe way

Submitted by: @import:stackexchange-codereview

Problem

I have an array that I want to enumerate using blocks concurrently. However, I'm having trouble making this thread safe. I am new to using blocks and locks, so I am hoping someone may be able to push me in the right direction for preventing this from crashing.

The point of this function is to loop over a number of files and folders.

  • if folder, create a new dictionary item
  • if file, add as child to folder key
  • if folder, then recursively move into the folder to iterate over all files and folders and add to dictionary

This builds a dictionary structure of the file system. However, it is slow and I would like to do this concurrently.

My main function looks like this:

```
- (void)createDirectoryStructure:(NSString *)LR withArray:(NSMutableArray *)myArray {

    __block NSFileManager *fm = [NSFileManager defaultManager];
    __block BOOL isDir=NO;
    __block NSString *local;
    __block NSString *myKey;

    [myArray enumerateObjectsWithOptions:NSEnumerationConcurrent usingBlock:^(id obj, NSUInteger idx, BOOL *stop) {

        NSString *url=obj;
        if ([LR isEqualToString:@"L"]) {
            myKey = [url stringByDeletingLastPathComponent];
            local = [self.rootdirL.path stringByAppendingString:url];
            [fm fileExistsAtPath:local isDirectory:&isDir];
            if (!isDir)
                [self updateStructureWithKey:myKey andURL:url isDir:isDir forLR:LR];
        }
    }];
}
```


This function calls updateStructureWithKey, which looks like this:

```
- (void)updateStructureWithKey:(NSString *)myKey andURL:(NSString *)url isDir:(BOOL)isDir forLR:(NSString *)LR
{
    NSArray *components = [myKey pathComponents];
    NSString *addPath = @"";
    NSUInteger counter = 0;

    for (NSString *component in components) {

        NSString *createDir = [addPath stringByAppendingPathComponent:component];
        addPath = createDir;

        counter += 1;
        if ((unsigned long)counter < (unsigned long)components.count) {
            NSString *addchild = [cr
```

Solution

- The bottleneck of your code might not be where you think it is. I recommend reading Apple's Performance Guidelines as well as the more specific File-System Performance Guidelines.

- Typically the bottleneck is access to the drive. Making the build-up of the dictionaries concurrent will therefore gain you little, as there is only one drive. You are calling fileExistsAtPath on NSFileManager for each item, which might be the bottleneck. Try to get this information up front when building myArray, which you are probably doing with a directory enumerator anyway. There are options to fetch specific metadata directly for a URL (such as whether it is a directory or a file), and this is then cached in the NSURL, so work with NSURL instead of path strings.

- Did you try setting breakpoints to find out more about the EXC_BAD_ACCESS? Is it caused by objects that were released too early, or by a collection that was mutated while being enumerated? Set a breakpoint on "All Exceptions" in Xcode and run the code in the debugger. You will then be able to find more details on the crash.

- To isolate the crash, try to make the code in your block smaller. Remove all code that is not really needed, such as the whole LR thing.

I've analyzed the provided code, which is not recursive. See my GitHub for the edits. Here are my findings (for a test directory with 27,861 nested items):

- Most time was spent in the enumeration, getting file-system metadata:

Time     Self    Symbol Name
3568.0ms  53.6%   -[SDAppDelegate createDirectoryStructure]
2680.0ms  40.2%   -[NSURLDirectoryEnumerator nextObject]


The new code only fetches as much metadata as needed and reuses it by using NSURL.
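
The actual edits are on GitHub; as a rough sketch of the idea only (not the code from the repository), prefetching just the directory flag during enumeration could look like the following. `SDCollectURLs` and `SDIsDirectory` are made-up helper names, and the sketch assumes you start from a root `NSURL` rather than from path strings.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collects every item below `root`. The enumerator is asked
// to prefetch only NSURLIsDirectoryKey, so no later fileExistsAtPath:isDirectory:
// call per item is needed.
static NSArray<NSURL *> *SDCollectURLs(NSURL *root) {
    NSFileManager *fm = [NSFileManager defaultManager];
    NSDirectoryEnumerator<NSURL *> *enumerator =
        [fm enumeratorAtURL:root
 includingPropertiesForKeys:@[NSURLIsDirectoryKey]
                    options:0
               errorHandler:^BOOL(NSURL *url, NSError *error) {
                   NSLog(@"Skipping %@: %@", url, error);
                   return YES; // keep enumerating after an error
               }];
    NSMutableArray<NSURL *> *urls = [NSMutableArray array];
    for (NSURL *url in enumerator) {
        [urls addObject:url];
    }
    return urls;
}

// Hypothetical helper: reads the directory flag from the NSURL's resource values.
static BOOL SDIsDirectory(NSURL *url) {
    NSNumber *isDir = nil;
    [url getResourceValue:&isDir forKey:NSURLIsDirectoryKey error:NULL];
    return isDir.boolValue;
}
```

With something like this, the array is built once and each later check is a lookup on the URL instead of a separate `NSFileManager` call with a path string.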

- The code for filling the array also did lots of duplicate checks:

Time      Self    Symbol Name
2997.0ms   45.0%    -[SDAppDelegate createArraysForLocalDirectories]
2109.0ms   31.6%    -[SDAppDelegate addDictionaryItem:withURL:isDir:]
1121.0ms   16.8%    -[NSArray containsObject:]


The new code does it a bit simpler. It could be even simpler, see comment in code.
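
One possible way to avoid the linear `containsObject:` scans, which is not necessarily what the code on GitHub does, is to let a hash-based collection handle the duplicates. This is a minimal sketch under that assumption; `SDUniqueParentFolders` is a made-up name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the distinct parent folders of `paths` in
// first-seen order, without calling -[NSArray containsObject:] per item.
static NSArray<NSString *> *SDUniqueParentFolders(NSArray<NSString *> *paths) {
    NSMutableOrderedSet<NSString *> *folders = [NSMutableOrderedSet orderedSet];
    for (NSString *path in paths) {
        // addObject: does nothing if the folder is already in the set.
        [folders addObject:[path stringByDeletingLastPathComponent]];
    }
    return folders.array;
}
```

Membership in an ordered set is hash-based, so the cost per item no longer grows with the number of folders already collected.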

- Another thing was memory management. I removed the nested autoreleasepool, as it's not really needed. See "Use Local Autorelease Pool Blocks to Reduce Peak Memory Footprint".

- As for concurrency, you were always checking the mutable self.dict to see whether some key exists. If you want to write to this dict from several threads, you have to synchronize every access to it with a lock. The simplest one is @synchronized().
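
As an illustration of the locking, here is a minimal sketch that groups paths by their parent folder with a concurrent enumeration. It assumes a plain local dictionary instead of your `self.dict`, and `SDGroupByFolder` is a made-up name. Note that the per-item values are local to the block rather than shared `__block` variables.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: maps each parent folder to the paths it contains.
static NSDictionary<NSString *, NSArray<NSString *> *> *SDGroupByFolder(NSArray<NSString *> *paths) {
    NSMutableDictionary<NSString *, NSMutableArray<NSString *> *> *dict =
        [NSMutableDictionary dictionary];

    [paths enumerateObjectsWithOptions:NSEnumerationConcurrent
                            usingBlock:^(NSString *path, NSUInteger idx, BOOL *stop) {
        // Block-local variable: nothing here is shared between invocations.
        NSString *folder = [path stringByDeletingLastPathComponent];

        // The check for the key and the write happen under the same lock,
        // so no other thread can mutate the dictionary in between.
        @synchronized (dict) {
            NSMutableArray<NSString *> *children = dict[folder];
            if (children == nil) {
                children = [NSMutableArray array];
                dict[folder] = children;
            }
            [children addObject:path];
        }
    }];
    return [dict copy];
}
```

The order of the children within a folder is not deterministic here, because the block runs concurrently. Since almost all of the work happens inside the lock, this is about correctness and should not be expected to be faster than a serial loop.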

- The improved code runs on the very same directory structure with the following times:

Time     Self    Symbol Name
735.0ms    6.6%    -[SDAppDelegate createDirectoryStructure]
268.0ms    2.4%    -[SDAppDelegate createArraysForLocalDirectories]


- You could make the code recursive by writing a method that uses NSDirectoryEnumerationSkipsSubdirectoryDescendants and then calls itself again for each directory. But that probably won't really speed it up.
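
A recursive variant along those lines could be sketched as follows. This is only an illustration, assuming a dictionary keyed by folder URL; `SDAddDirectory` is a made-up name and not part of the code on GitHub.

```objc
#import <Foundation/Foundation.h>

// Hypothetical recursive helper: lists one directory level at a time, stores
// folder URL -> direct children in `tree`, and calls itself for each subfolder.
static void SDAddDirectory(NSURL *dirURL,
                           NSMutableDictionary<NSURL *, NSArray<NSURL *> *> *tree) {
    NSFileManager *fm = [NSFileManager defaultManager];
    // SkipsSubdirectoryDescendants makes the enumeration shallow.
    NSDirectoryEnumerator<NSURL *> *enumerator =
        [fm enumeratorAtURL:dirURL
 includingPropertiesForKeys:@[NSURLIsDirectoryKey]
                    options:NSDirectoryEnumerationSkipsSubdirectoryDescendants
               errorHandler:nil];
    NSArray<NSURL *> *children = enumerator.allObjects ?: @[];
    tree[dirURL] = children;

    for (NSURL *child in children) {
        NSNumber *isDir = nil;
        [child getResourceValue:&isDir forKey:NSURLIsDirectoryKey error:NULL];
        if (isDir.boolValue) {
            SDAddDirectory(child, tree); // recurse into the subfolder
        }
    }
}
```

You would call it once with the root URL and an empty mutable dictionary. It touches the same files as a single deep enumeration, which is why a speed-up is not to be expected.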

Context

StackExchange Code Review Q#39435, answer score: 3
