Core Data – efficient fetching of portions of Entities

February 14, 2014 by

Quick write-up of an active conversation on Twitter and the results of some research.

Brent Simmons posted about something I’d started looking into today: efficient Core Data fetching when you only need a few fields of a database entity.  The name of setPropertiesToFetch: in NSFetchRequest looks promising; however, the documentation says:

This value is only used if resultType is set to NSDictionaryResultType.

Well, that’s a bummer, because we’d really like our view controller (say) to keep using our nice subclass of NSManagedObject, even when we only need to access one property of the complex object.  Here’s something odd I noticed though: the comment in the NSFetchRequest.h file (OS X 10.8 SDK & 10.9 SDK) for setPropertiesToFetch: says this:

If NSManagedObjectResultType is set, then NSExpressionDescription cannot be used, and the results are managed object faults partially pre-populated with the named properties

Well hey! That’s exactly what we want it to do for efficiency when we only need a couple of properties out of many for a given entity.

Thanks to some Twitter dialog with @pgor and others, and especially a reminder from @JimRoepcke about setting -com.apple.CoreData.SQLDebug 1 to show the SQL logging for Core Data when talking to an SQLite backend, I was able to verify that this does work the way we want it to, at least on OS X 10.8.4.

If you use code like the following:


NSFetchRequest * request = [NSFetchRequest fetchRequestWithEntityName: @"AppEntry"];
[request setPredicate: [NSPredicate predicateWithFormat: @"app_id in %@", inAppIDs]];
[request setIncludesSubentities: NO];
[request setPropertiesToFetch: @[ @"app_id", @"fooprop"]];
// [request setIncludesPropertyValues: NO]; // you need YES, which is the default
[request setResultType: NSManagedObjectResultType];
[request setReturnsObjectsAsFaults: NO]; // maybe not needed.  It will still be a fault,
                                         // but the properties are preloaded
NSError * error = nil;
NSArray * fetchedArray = [self.managedObjectContext executeFetchRequest: request error: &error ];
if ( fetchedArray != nil )
{
  if ( [fetchedArray count] > 0 )
  {
    AppEntry * entry = [fetchedArray firstObject];
    NSLog(@" isFault? %@", entry.isFault ? @"YES" : @"NO" );
    NSNumber * appID = entry.app_id;
    NSLog(@" appID: %@", appID );
    NSLog(@" isFault? %@", entry.isFault ? @"YES" : @"NO" );
      // accessing property not in setPropertiesToFetch:
      // causes fault to fire & load whole object
    NSString * otherProperty = entry.otherProperty;
    NSLog(@" isFault? %@", entry.isFault ? @"YES" : @"NO" );
    NSLog(@"otherProperty: %@", otherProperty );
  }
  else
    NSLog(@"no results");
}

(code changed to mask client project details so typos are because of that)

and the output is:

// the fetch does this - only the two properties we indicated are fetched:
 CoreData: sql: SELECT t0.Z_ENT, t0.Z_PK, t0.ZAPP_ID, t0.ZFOO FROM 
                  ZAPPENTRY t0 WHERE t0.ZAPP_ID IN (?,?,?)
 isFault? YES
 appID: 281940292
 isFault? YES
 CoreData: sql: SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZAPP_ID, t0.ZOTHERPROPERTY, 
                   t0.ZDATESTAMP, t0.ZNAME, t0.ZFOO, t0.ZTITLE FROM ZAPPENTRY t0 
                   WHERE t0.Z_PK = ?
 CoreData: annotation: sql connection fetch time: 0.0025s
 CoreData: annotation: total fetch execution time: 0.0030s for 1 rows.
 CoreData: annotation: fault fulfilled from database for : 0x100150590 ...
 isFault? NO
 otherProperty: it works!

So you can see that the main fetch request only grabbed the two properties that were in setPropertiesToFetch:. The result object is a faulted NSManagedObject (see where we checked isFault) but the property values for the properties listed in setPropertiesToFetch: are available without faulting the object and doing another database fetch. You’ll see we accessed appID to check this.

However, if you then access a property that was not in the setPropertiesToFetch: list (otherProperty in the above code) you can see that another SQL call is made and the object is faulted and fully loaded.  The next check for isFault returns NO because it is no longer a fault.

So to me this indicates that setPropertiesToFetch: IS useful exactly as we’d like even when you’re not getting the NSDictionaryResultType, and so is ideal for exactly the situation Brent was asking about for his timeline view.  I’ve filed a radar on the documentation error, rdar://16073227, so hopefully that’ll get updated sooner rather than later.

Update:

I was curious whether one could modify one of these limited properties without faulting the entire object and all of its properties (which seemed unlikely), so I did an additional test and the answer is: nope.  Modifying a prefetched property on the object fires the fault, the whole object and all its properties are loaded into memory, and isFault returns NO.

Update #2: 

So the technique above also turns out to be handy when you only need an item from a one-to-many relationship on an object with lots of properties and want to avoid loading all those properties into memory.  In our sample data an AppEntry has a bunch of properties, including a relationship to a list of prices, and we’d like the most recent price (just any price for this sample code) without loading the rest of the data in the AppEntry entity.  It works like this:

NSFetchRequest * request = [NSFetchRequest fetchRequestWithEntityName: @"AppEntry"];
[request setPredicate: [NSPredicate predicateWithFormat: @"app_id == %@", inAppID]];
[request setIncludesSubentities: NO];
[request setReturnsObjectsAsFaults: NO];
[request setPropertiesToFetch: @[ @"app_id" ]];
[request setRelationshipKeyPathsForPrefetching: @[ @"prices" ]];
NSError * error = nil;
NSArray * fetchedArray = [self.managedObjectContext executeFetchRequest: request error: &error ];
if ( fetchedArray != nil )
{
  if ( [fetchedArray count] > 0 )
  {
    AppEntry * appData = [fetchedArray firstObject];
    if ( appData != nil )
    {
      PriceEntry * price = [appData.prices anyObject];
      NSLog(@"a price is: %@", price.price );
    }
  }
}

and the logging looks like this:

CoreData: sql: SELECT t0.Z_ENT, t0.Z_PK, t0.ZAPP_ID FROM ZAPPENTRY t0 WHERE t0.ZAPP_ID = ?
CoreData: annotation: sql connection fetch time: 0.0005s
CoreData: sql: SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZDATESTAMP, t0.ZPRICE, t0.ZAPP_ID
   FROM ZPRICEENTRY t0 WHERE t0.ZAPP_ID IN (?) ORDER BY t0.ZAPP_ID
CoreData: annotation: sql connection fetch time: 0.0004s
CoreData: annotation: total fetch execution time: 0.0007s for 1 rows.
CoreData: annotation: Prefetching with key 'prices'. Got 1 rows.
CoreData: annotation: total fetch execution time: 0.0022s for 1 rows.
a price is: 0

So you can see that it only loads the app_id property from the AppEntry object, but you can still get to the prices relationship.  Unfortunately I’m not seeing a way to cascade the setPropertiesToFetch: down into the related entities, so all the fields in the PriceEntry in this example get fetched.  I’d love to find a way to do that, so if you figure that out please let me know – thanks!

Update #3: 

(I know, I know! What’s with all these updates?! – I wanted to get the basics out there and then I kept thinking of other angles/aspects of this topic).

So one thing that might be helpful is to show how to use the -com.apple.CoreData.SQLDebug 1 setting to examine what Core Data is doing when you’re using it. Setting this was critical to understanding what was happening in the above research.  You want to pass this as an argument to your app. So if you’re launching your app from the terminal you’d pass -com.apple.CoreData.SQLDebug 1 as a command line argument as usual. If you’re running from within Xcode then you set the arguments to pass to your app in the Scheme settings which you access via Edit Scheme… or Manage Schemes… + Edit Scheme….  Here’s a screenshot of the Edit Scheme sheet where you can see how to enter this flag:

(Screenshot: Set Command Line Argument)

A Legend Awakens

February 10, 2014 by

“A Legend Awakens” in the Tengwar script written using a Fraktur black-letter calligraphic style, by Geek. A snow day project.


NSURLConnection & GCD

January 9, 2014 by

I was using NSURLConnection combined with GCD for the first time and noticed that if you create an NSURLConnection on anything other than the main queue it seems to lock up and nothing happens.  Actually something happens, but your delegate never gets called: by default the connection delivers its delegate callbacks via the run loop of the thread that created it, and a GCD worker thread generally isn’t running one.  Easy fix though; you just need to set the delegate queue (or run loop) before starting the download.

Here’s some sample code setting the delegate queue for you to play with if you like. Just make a sample Cocoa app from the standard template and then set up the application delegate like this:


@interface AppDelegate ()
  @property (strong) NSURLConnection* connection;
  @property (strong) NSMutableData * downloadedData;
@end

@implementation AppDelegate

- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
  // Insert code here to initialize your application

  dispatch_async( dispatch_get_global_queue( DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0), 
   ^{  
      self.downloadedData = [NSMutableData new];
      NSURLRequest * request = [NSURLRequest requestWithURL: 
      [NSURL URLWithString: @"http://wp.me/av6Bm-pZ"]];
      self.connection = [[NSURLConnection alloc] initWithRequest: request 
                          delegate: self 
                        startImmediately: NO];  // This *must* be "NO". 
                                                // Can't switch delegate queue 
                                                // after download starts
      [self.connection setDelegateQueue: [NSOperationQueue mainQueue]];  // *** set queue
      [self.connection start];   // *now* you can start the fetch
      NSLog(@"request queued!");
   });
}

// delegate calls just to let us know when it's working or when it isn't

- (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error
{
    NSLog( @"download failed with an error: %@, %@", 
        error, [error description] );

    // release this stuff, test is done.
    self.connection = nil;
    self.downloadedData = nil;
}

- (void)connection:(NSURLConnection *)connection didReceiveResponse:(NSURLResponse *)response
{
    NSLog( @"connection did receive response: %@", response );
}

- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data
{
    NSLog(@"download data received %lu", [data length] );
    [self.downloadedData appendData: data];
}

- (void)connectionDidFinishLoading:(NSURLConnection *)connection
{
    NSLog(@"didFinishLoading.  total data is: %lu", [self.downloadedData length] );

    // release this stuff, test is done.
    self.connection = nil;
    self.downloadedData = nil;
}

@end

The setDelegateQueue: call is the key. If you leave that line out then you’ll wonder why it’s not downloading your data. If you replace the global queue passed to dispatch_async with dispatch_get_main_queue() then it’ll work without the setDelegateQueue: call.

Obvious once you see it but maybe this’ll save someone a bit of time.

P.S. This was done with Xcode 5 under 10.8.5 fwiw.

10 year old Geek explains why he does what his parents ask…

January 3, 2014 by

Posted this on Twitter but then realized that goes away and this is funny enough I wanted to keep it around.


In case his 10-year-old writing is hard to read,

“I have to do what my parents tell me because they are insane and their doctor told me to humor them.”

:-P

None of us remembered this but it was found during our move and gave us all a laugh.  Creative rationalization.

Something I wrote back in March when trying out Draft

November 21, 2013 by

Late one night back in March I tried out a cloud-based editing platform that Nathan Kontny is putting together called Draft.  It’s designed to save various drafts of your writing and to facilitate getting feedback from others as you write.  I sat down and wrote one short thing and then didn’t get back to it: we packed up our farm and moved so Geek could go to college but still live at home (he’s 16), and I didn’t have time.

Today Nathan sent an email about some neat new features which prompted me to go take another look.  WOW!  He’s added a ton of interesting capabilities.  Anyway, I realized I hadn’t published the thing I wrote, and while it started out as a “blank page! Yikes!” it did document some of what was happening then and some of my thoughts about the 1% (big in the news at the time) and the challenging economic times.  Since Draft doesn’t have a feature to host finished work (turns out it does: Draft Sites), I thought I’d put it here so I don’t forget it:

So, it begins. Once again all thought stops when faced with a blank page. An open window through which only air and the sounds carried upon it pass. And light. And smells. Fresh spring smells and sounds.

Where exactly to begin? Do we start with the falling apart of the social fabric? Or maybe the seeming inability of a large percentage of the super wealthy to see that sucking all the money out of an economy is like sucking all the air out of a sealed space. A fatal result for those on the inside in both cases. Even good people get desperate when they can’t breathe. Or don’t have enough food to feed their children.

Sent money today to a friend whose regular weekly music gig got cancelled for good after twice being cancelled “just this weekend.” As a result he doesn’t have enough money for food despite living on bags of potatoes from Costco which he adorns with left-over condiments from his friend who cleans vacation rentals. Said he saw a rat in his kitchen the other night and wonders if country rats carry much disease. “As a Buddhist I don’t want to kill it but I’m a little scared it might carry disease and I can’t afford a visit to a doctor…” he told me.

Time to sleep so I can earn more money tomorrow.  Might get a call from another friend in need.

—-

By the way, Nathan has a lot of great stuff on his blog about all kinds of things.  I especially liked:

Why Science Fiction?

October 16, 2013 by

Neil Gaiman on Science Fiction:

I was in China in 2007, at the first party-approved science fiction and fantasy convention in Chinese history. And at one point I took a top official aside and asked him Why? SF had been disapproved of for a long time. What had changed?

It’s simple, he told me. The Chinese were brilliant at making things if other people brought them the plans. But they did not innovate and they did not invent. They did not imagine. So they sent a delegation to the US, to Apple, to Microsoft, to Google, and they asked the people there who were inventing the future about themselves. And they found that all of them had read science fiction when they were boys or girls.

So, maybe encourage a little Science Fiction in your child’s menu of reading material.

I probably don’t have to say this, but introduce your daughters to science fiction also. There are some great female science fiction authors  and there’s another list here.  Dad finds that female authors seem to be a lot more likely to include strong & smart female characters that have depth and are thus more interesting.  Peace.

XM3RPG in Public Beta

August 9, 2013 by

I’m pleased to announce that XM3RPG, the pen-and-paper role playing game I’ve been working on, is ready for its first public beta! Check it out here, download the rules, try it out, and please send me feedback! This is still an early beta, so there are almost certainly issues remaining – tell me what they are so I can fix them. Thanks & Enjoy!

Sorry about the crickets…

May 5, 2013 by

…but Geek decided to skip 12th grade and go to Reed College next year!  That sudden acceleration of him ending high school and starting college has caused a lot of scurrying around as we figure out logistics.  We currently plan on selling our 21 acre homestead and farm and moving into Portland near Reed so Geek can live at home the first year (he’s only 16).  As a result, we’ve been too busy to do much here on the blog.

We’ll be back!

Basic ObjC/C++ performance à la Mike Ash

March 14, 2013 by

Rich @siegel asked on Twitter:

and @danielpunkass pointed to a @mikeash blog post with some tests:

I was curious about this so I added that test and a test of the new @autoreleasepool {} construct. The modified code is here (@mikeash feel free to grab it and use it on your site) and the results of running this on my older MacBook Air 1.8GHz Core i7 with 4GB 1333MHz DDR3 RAM and SSD (other apps running in the background) are as follows:

Name                                  Iterations  Total time (sec)  Time per (ns)
IMP-cached message send               1000000000               0.2            0.2
C++ virtual method call               1000000000               0.4            0.4
Block invocation                      1000000000               0.5            0.5
Integer division                      1000000000               2.3            2.3
Objective-C message send              1000000000               2.7            2.7
16 byte memcpy                         100000000               0.4            4.4
Floating-point division                100000000               0.6            6.0
Float division with int conversion     100000000               0.6            6.3
New Autorelease construct               10000000               0.2           17.1
NSAutoreleasePool alloc/init/release    10000000               0.6           63.5
16 byte malloc/free                    100000000               6.5           64.7
NSInvocation message send               10000000               0.9           91.3
NSObject alloc/init/release             10000000               1.6          165.0
16MB malloc/free                          100000               0.1          880.0
NSButtonCell creation                    1000000               3.2         3201.7
Read 16-byte file                         100000               0.8         8250.8
pthread create/join                        10000               0.2        23009.4
Zero-second delayed perform               100000               2.8        28038.3
NSButtonCell draw                         100000               4.7        47448.6
1MB memcpy                                 10000               0.8        78902.2
Write 16-byte file                         10000               2.1       209280.7
Write 16-byte file (atomic)                10000               3.9       392459.6
Read 16MB file                               100               0.8      8050127.5
NSTask process spawn                        1000              67.3     67305116.9
Write 16MB file (atomic)                      30               2.1     70342873.5
Write 16MB file                               30               2.2     72666382.1

Built it with:

cc -v -lstdc++ -framework Cocoa perf.mm

and the dev tools are:
Apple LLVM version 4.2 (clang-425.0.24) (based on LLVM 3.2svn)
Target: x86_64-apple-darwin12.2.0
Thread model: posix

Xcode 4.6 basically.

If Mike reads this: feel free to grab any of this you want and copy it to your site. Just put it here because that was easiest.

Not sure my tests are exactly right so feel free to make suggestions etc.  One thing I didn’t test but might add another test for is how handling local variables used in the block impacts things; likewise local __block variables in the block, but the whole point of this test suite seemed to be keeping things as simple as possible so I stuck with that.
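For anyone reading the table above: the "Time per (ns)" column is just the total time divided by the iteration count, so 2.7 seconds over 1,000,000,000 Objective-C message sends is 2.7ns each. (Where the two columns don't quite agree, e.g. 0.4 sec vs. 4.4ns for the 16 byte memcpy, that looks like rounding of the printed total.) Here's a minimal plain C sketch of that kind of measurement. It is not the harness from Mike's post; the function names and the integer-division workload are made up for illustration, and it assumes a POSIX clock_gettime is available.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature for a workload that runs `iterations` times. */
typedef void (*workload_fn)(uint64_t iterations);

/* Nanoseconds per iteration, given the total elapsed time in seconds. */
double ns_per_iteration(double total_sec, uint64_t iterations)
{
    assert(iterations > 0);
    return total_sec * 1e9 / (double)iterations;
}

/* Run the workload once and return the elapsed wall time in seconds,
   measured with the monotonic clock. */
double time_workload_sec(workload_fn work, uint64_t iterations)
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    work(iterations);
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Illustrative workload: integer division. The volatile sink keeps the
   compiler from optimizing the loop away. */
void integer_division_workload(uint64_t iterations)
{
    volatile uint64_t sink = 0;
    for (uint64_t i = 1; i <= iterations; i++)
        sink = UINT64_MAX / i;
    (void)sink;
}
```

You'd call it as ns_per_iteration(time_workload_sec(integer_division_workload, n), n) for some iteration count n, and pick n large enough that the loop runs for a good fraction of a second so the timer overhead disappears in the noise.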

UPDATE:

@schwa asked about iOS and since I had the day off I went ahead and did those as well, though I had to remove some tests I didn’t feel like fixing for iOS (zero delay perform selector messes with the RunLoop which caused crashes, no NSButton class, NSTask (?), and the file I/O tests weren’t going to work as is).  Started with the single view iOS Application Xcode template and added a text view to jam the results into.  This was run on an iPhone 5 running iOS 6.0.2:

Name                                  Iterations  Total time (sec)  Time per (ns)
C++ virtual method call               1000000000               0.8            0.8
Block invocation                      1000000000               2.8            2.8
IMP-cached message send               1000000000               4.6            4.6
Objective-C message send              1000000000              14.7           14.7
Float division with int conversion     100000000               1.8           17.8
Floating-point division                100000000               1.8           17.8
Integer division                      1000000000              21.6           21.6
16 byte memcpy                         100000000               6.1           61.0
New Autorelease construct               10000000               0.8           83.4
16 byte malloc/free                    100000000              46.8          467.5
NSAutoreleasePool alloc/init/release    10000000               5.2          519.2
NSObject alloc/init/release             10000000              12.6         1259.0
NSInvocation message send               10000000              13.5         1346.3
16MB malloc/free                          100000               1.5        15374.9
pthread create/join                        10000               1.2       116415.9
1MB memcpy                                 10000               3.9       394341.5

Displaying progress during long operations – some thoughts. (UPDATED)

March 1, 2013 by

Humans generally hate to wait.  So as software developers we try to make everything as fast as possible.  Sometimes things just take time.  Like when you have to do 38,000 of something moderately slow.

Finite progress reporting (by which I mean “X of Y” or “X%”) is generally better from a user’s perspective than indeterminate reporting (a spinner, which shows only that something is happening, even if that something is just the spinner spinning).

Experienced programmers will have analyzed the performance on their application and realized that in some cases the updating of the progress count UI is actually taking a significant percentage of the total time to do the operation.  The usual “fix” is to only update the count after processing a chunk of items – say 10, 50, or 100 items – instead of after each one.  This amortizes the cost of updating the screen across more than one item and reduces the total number of updates which thus makes updating the UI a smaller percentage of the total time for the operation.  I often see people use round numbers like 10, 50 or 100 in a modulo test like this:


if (done % 100 == 0 )
{
    [self updateProgressDone: done of: total];
}

Here’s the thing I just noticed: as a user it doesn’t feel as fast or good to see something count up by hundreds, or even fifties.  I decided that it was because so many of the digits aren’t changing for such a large percentage of the time I watched.  So I tried changing the modulo to 99 so that all the digits change and sure enough: it feels faster.  Fascinating!

I then tried something else that I kind of like too:  take the performance hit on the last few and display them all at the end like this:


if (done % 99 == 0 || done + 100 > total  )
{
    [self updateProgressDone: done of: total];
}

So the UI will visually count up by 99 at a time until the last 100 and then it’ll count up by ones.  If that’s too slow you could count by fives at the end like this:


if (done % 99 == 0 || (done + 100 > total && done % 5 == 0) )
{
    [self updateProgressDone: done of: total];
}
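To get a feel for what the "count by 99, then by ones for the last 100" rule costs in screen updates, here's a small plain C sketch that counts them. The function and parameter names are made up for illustration, and it assumes done runs from 1 up to total (the snippets above don't say where the count starts).

```c
#include <assert.h>
#include <stdbool.h>

/* Should the progress UI be redrawn after item number `done` of `total`?
   stride: redraw every `stride` items (99 in the post).
   tail:   redraw on every item once fewer than `tail` remain (100 in the post);
           pass 0 to switch the tail behaviour off. */
bool should_update(unsigned long done, unsigned long total,
                   unsigned long stride, unsigned long tail)
{
    assert(stride > 0);
    return (done % stride == 0) || (done + tail > total);
}

/* Count how many redraws a full run over `total` items would trigger,
   assuming `done` counts 1, 2, ..., total. */
unsigned long count_updates(unsigned long total,
                            unsigned long stride, unsigned long tail)
{
    unsigned long updates = 0;
    for (unsigned long done = 1; done <= total; done++)
        if (should_update(done, total, stride, tail))
            updates++;
    return updates;
}
```

For the 38,000 item case, count_updates(38000, 99, 100) comes to 482 redraws versus 383 for the plain modulo-99 test with no tail, so showing the last hundred one at a time costs about a hundred extra updates.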

Anyway, something I was playing with today.  I’m sure there’s a lot more interesting psychology involved with this (I remember reading about a progress bar that speeds up its reporting over time), but this is an interesting simple change that really made a difference in some UI I just wrote.  I’d love to do actual research on this but alas I have software to write and ship :-)

2013.03.17 UPDATE:

So after using this on various machines with varying other loads on the machine I decided that anything which isn’t time based is doomed to failure because what matters is how often the user perceives an update of status happening.  So I’ve switched to the following:


static dispatch_time_t lastUpdate = 0;
dispatch_time_t now = dispatch_time( DISPATCH_TIME_NOW, 0);

if ( now - lastUpdate > 250 * NSEC_PER_MSEC )
{
  lastUpdate = now;
  [self updateProgressDone: done of: total];
}

and am updating every quarter second (250ms).  I may play with putting the ramping up of updates in there (starting with say 350ms and going to 200ms towards the end, or something like that), but this actually seems good enough now and handles the case where the app is running on a slow machine and Spotlight is dogging the entire machine etc.
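Here's the same time-based idea as a plain C sketch you can test without a UI. Two things in it are my assumptions rather than part of the snippet above: it reads a POSIX monotonic clock in nanoseconds (dispatch_time_t is meant to be an opaque value, so I'd be a little wary of doing arithmetic on it on every platform), and it always reports the final item so the display ends on "total of total". The names are hypothetical.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* 250 ms, the interval used in the post, expressed in nanoseconds. */
#define MIN_UPDATE_INTERVAL_NS 250000000ULL

/* Current monotonic time in nanoseconds. */
uint64_t monotonic_now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Decide whether the progress UI is due for a redraw at time `now_ns`.
   Redraws when more than the minimum interval has passed since the last
   redraw, or when the work is finished (assumption: we always want the
   final count shown). Records the redraw time in *last_update_ns. */
bool progress_due(uint64_t now_ns, uint64_t *last_update_ns,
                  unsigned long done, unsigned long total)
{
    bool finished = (done >= total);
    bool interval_elapsed =
        (now_ns - *last_update_ns) > MIN_UPDATE_INTERVAL_NS;

    if (finished || interval_elapsed) {
        *last_update_ns = now_ns;
        return true;
    }
    return false;
}
```

In the processing loop you'd keep a uint64_t lastUpdate = 0 next to the counters and call progress_due(monotonic_now_ns(), &lastUpdate, done, total) after each item, updating the UI only when it returns true. Taking the timestamp as a parameter keeps the decision itself easy to unit test.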

