Concurrently enumerating an array using blocks in a thread-safe way
Problem
I have an array that I want to enumerate using blocks concurrently. However, I'm having trouble making this thread safe. I am new to using blocks and locks, so I am hoping someone may be able to push me in the right direction for preventing this from crashing.
The point of this function is to loop over a number of files and folders:
- if folder, create a new dictionary item
- if file, add as child to folder key
- if folder, then recursively move into the folder to iterate over all files and folders and add to dictionary
This builds a dictionary structure of the file system. However, it is slow and I would like to do this concurrently.
My main function looks like this:
```
- (void)createDirectoryStructure:(NSString *)LR withArray:(NSMutableArray *)myArray {
    __block NSFileManager *fm = [NSFileManager defaultManager];
    __block BOOL isDir=NO;
    __block NSString *local;
    __block NSString *myKey;
    [myArray enumerateObjectsWithOptions:NSEnumerationConcurrent usingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
        NSString *url=obj;
        if ([LR isEqualToString:@"L"]) {
            myKey = [url stringByDeletingLastPathComponent];
            local = [self.rootdirL.path stringByAppendingString:url];
            [fm fileExistsAtPath:local isDirectory:&isDir];
            if (!isDir)
                [self updateStructureWithKey:myKey andURL:url isDir:isDir forLR:LR];
        }
    }];
}
```
This function calls updateStructureWithKey, which looks like this:
```
-(void) updateStructureWithKey:(NSString *)myKey andURL:(NSString *)url isDir:(BOOL)isDir forLR:(NSString*)LR
{
    NSArray *components=[myKey pathComponents];
    NSString *addPath=@"";
    NSUInteger counter=0;
    for (NSString *component in components){
        NSString *createDir=[addPath stringByAppendingPathComponent:component];
        addPath=createDir;
        counter+=1;
        if ((unsigned long)counter<(unsigned long)components.count){
            NSString *addchild = [cr
```
Solution
- The bottleneck of your code might not be where you think it is. I recommend reading Apple's Performance Guidelines as well as the more specific File-System Performance Guidelines.
- Typically the bottleneck is access to the drive. Making the build-up of the dictionaries concurrent will therefore gain you little, as there is only one drive. You are checking each item with NSFileManager's fileExistsAtPath, which might be the bottleneck. Try getting this information up front when building myArray; you are probably already doing this with a directory enumerator. There are options to fetch specific metadata directly for a URL (such as whether it is a directory), and this is then cached in the NSURL (instead of working with path strings).
- Did you try setting some breakpoints to find out more about the EXC_BAD_ACCESS? Is it because objects were released too early? Or is it because a collection was mutated while being enumerated? Set a breakpoint on "All Exceptions" in Xcode and run the code in the debugger. You will then be able to find more details on the crash.
- To isolate the crash, try to make the code in your block smaller. Remove all the code that is not really needed, like this whole LR thing.
I've analyzed the provided code, which is not recursive. See my GitHub for the edits. Here are my findings (for a test directory with 27’861 nested items):
- Most time was spent in the enumeration, getting file-system metadata:

      Time Self Symbol Name
      3568.0ms 53.6% -[SDAppDelegate createDirectoryStructure]
      2680.0ms 40.2% -[NSURLDirectoryEnumerator nextObject]

  The new code only fetches as much metadata as needed and reuses it by using NSURL.
- The code for filling the array also did lots of duplicate checks:

      Time Self Symbol Name
      2997.0ms 45.0% -[SDAppDelegate createArraysForLocalDirectories]
      2109.0ms 31.6% -[SDAppDelegate addDictionaryItem:withURL:isDir:]
      1121.0ms 16.8% -[NSArray containsObject:]

  The new code does this a bit more simply. It could be even simpler; see the comment in the code.
- Another thing was memory management. I removed the nested autoreleasepool; it's not really needed. See "Use Local Autorelease Pool Blocks to Reduce Peak Memory Footprint".
- As for concurrency, you were always checking the mutable self.dict to see whether some key exists. If you want to write to this dict, you have to synchronize access to it with a lock. The simplest one is @synchronized().
- The improved code runs on the very same directory structure with the following times:

      Time Self Symbol Name
      735.0ms 6.6% -[SDAppDelegate createDirectoryStructure]
      268.0ms 2.4% -[SDAppDelegate createArraysForLocalDirectories]

- You could make the code recursive by writing a method that uses NSDirectoryEnumerationSkipsSubdirectoryDescendants and then calls itself for each directory. But that probably doesn't really speed it up.
Code Snippets
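Two of the suggestions in the answer can be sketched together: prefetching the directory flag through NSURL instead of calling fileExistsAtPath per item, and locking every access to the shared mutable dictionary. This is a minimal sketch under stated assumptions, not the code from the GitHub edits. The function name SDBuildStructure and the shape of the result (each directory path mapped to the names of the files directly inside it) are hypothetical and chosen only for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): maps each directory path
// below `root` to an array with the names of the files directly inside it.
static NSDictionary *SDBuildStructure(NSURL *root) {
    NSFileManager *fm = [NSFileManager defaultManager];

    // Ask the enumerator to prefetch the one property that is needed, so no
    // separate fileExistsAtPath:isDirectory: call is made for each item.
    NSDirectoryEnumerator *enumerator =
        [fm enumeratorAtURL:root
            includingPropertiesForKeys:@[NSURLIsDirectoryKey]
                               options:0
                          errorHandler:nil];

    // Walk the tree serially and keep only the files.
    NSMutableArray *files = [NSMutableArray array];
    for (NSURL *url in enumerator) {
        id isDirValue = nil;
        [url getResourceValue:&isDirValue forKey:NSURLIsDirectoryKey error:NULL];
        if (![isDirValue boolValue]) {
            [files addObject:url];
        }
    }

    NSMutableDictionary *dict = [NSMutableDictionary dictionary];

    [files enumerateObjectsWithOptions:NSEnumerationConcurrent
                            usingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
        // Block-local values only: no __block variables shared across threads.
        NSURL *url = obj;
        NSString *key = url.URLByDeletingLastPathComponent.path;
        NSString *name = url.lastPathComponent;

        // Every read and write of the shared dictionary happens under one lock.
        @synchronized (dict) {
            NSMutableArray *children = dict[key];
            if (children == nil) {
                children = [NSMutableArray array];
                dict[key] = children;
            }
            [children addObject:name];
        }
    }];

    return dict;
}
```

Because every update takes the same lock, the concurrent pass is largely serialized here, which fits the point that concurrency is unlikely to be the main win for this kind of work.
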

    Time Self Symbol Name
    3568.0ms 53.6% -[SDAppDelegate createDirectoryStructure]
    2680.0ms 40.2% -[NSURLDirectoryEnumerator nextObject]

    Time Self Symbol Name
    2997.0ms 45.0% -[SDAppDelegate createArraysForLocalDirectories]
    2109.0ms 31.6% -[SDAppDelegate addDictionaryItem:withURL:isDir:]
    1121.0ms 16.8% -[NSArray containsObject:]

    Time Self Symbol Name
    735.0ms 6.6% -[SDAppDelegate createDirectoryStructure]
    268.0ms 2.4% -[SDAppDelegate createArraysForLocalDirectories]

Context
StackExchange Code Review Q#39435, answer score: 3