Guest Post by @bizzybarney! A Peek Inside the PPSQLDatabase.db Personalization Portrait Database

The DFIR Twitter-sphere exploded this morning when @mattiaep mentioned /private/var/mobile/Library/PersonalizationPortrait/PPSQLDatabase.db. I’ve been doing some research work on this file and plan to present pieces of it during my talk at the upcoming SANS DFIR Summit. I reached out to @iamevltwin and asked if she would host a quick blog post and she graciously agreed - but I now owe her gin, and steak and cheese egg rolls. In all seriousness, a huge thank you to my good friend and mentor!

Check out the upcoming Summit agenda here.

The PersonalizationPortrait directory is native on iOS 13 and contains a few interesting files. One specifically, PPSQLDatabase.db, is loaded with data. From my research, the directory existed on iOS 12 but this database did not.  Some of the data is repetitive from other native locations, but there’s a lot of context that can be correlated from this database and pieces of this data might exist longer here than they do in other places. So what is the purpose of this database? Don’t forget as you poke at it, that it has ‘Personalization’ in its title. It resides in the native /Library directory, so it’s Apple. What are you up to Apple? 

My research on this is still ongoing, but one thing I know is Apple wants me to like my device and to feel comfortable while I’m constantly immersed in their operating system. When I open News I want to see certain things of interest. When I select a photo to send to a friend Apple gives me a short list of probable recipients. When I plug my car into CarPlay I appreciate when Apple weirdly predicts where I’m headed. Is this ‘Personalization’ just for Apple or are pieces of this filtering out to other parties for better advertisement tailoring? The Widgets screen to the left of my home screen on my iPhone is a dynamic representation of so many pieces of personal data. What’s next on my Calendar and where it’s occurring, which apps I recently used, News, Weather based on my location, where I last parked my car when using CarPlay, and ScreenTime totals I can’t hardly believe. 

I want to highlight a few pieces of the PPSQLDatabase.db file to show it’s potential, and pass along an appropriate amount of caution as well. This database is aggregating data from many sources and attribution must be done carefully.

The first table to peek into is loc_records. It is absolutely loaded with location data - mine had 1000 entries. The 1000 is curious, and I wonder if anyone else’s table is limited at 1000 or if mine just landed on that nice even number. @iamevltwin did a quick check and found 1000 as well on Mac and iOS, so it seems that is the limit for the table. In reviewing just the locations from this table without joining any other table data, I quickly recognize many of them as places I have definitely been with this device in my pocket. But…then there are those locations I definitely haven’t been, ever. So while having 1000 locations records to play with sounds like good fun, don’t forget to be diligent with attribution.

Most of these location records have a column named “clp_location” which is stored as BLOB data. At closer inspection the BLOB data is a binary plist (.bplist). 

Using plutil -p produces a more friendly output format to inspect, reminiscent of Cloud-V2.sqlite and other “Significant Location” files. But that doesn’t mean my device was at that place at that time, because I wasn’t in Las Vegas on June 2nd, 2020 which is what that timestamp converts to. I checked my photos based on the ‘Places’ feature, and I did take a photo at that location - but it was June 4th, 2019 at 5:35PM. Weird.

So why is a photo from a year ago popping up? Making a few joins and gathering a bit more data produces more context to this specific item. After hammering out a query to join it over to the ‘sources’ table, I am able to see that this entry is associated with ‘com.apple.mobileslideshow’ and more specifically ‘com.apple.proactive.PersonalizationPortrait.PhotosGraphDonation’. One word really jumps out there - ‘proactive.’ That makes me think Apple is doing something for me here, and without asking. I went to my Photos section and found a categorization of Las Vegas photos for June 2-7 in my ‘Months’ category, and the photo is in that album. As I scanned the other places in these ‘proactive’ listings I am also able to see photo groupings from a trip to Jamaica (Kingston) and to Huntington Beach, CA in their own categories. 

So you may be asking yourself at this point, “Why did I just read this? It’s a location in a photo.” Stick with me, and remember we are trying to figure out what this newly discovered file is doing and how far reaching it might be. 

A few lines down from the photos we have an address in New Jersey being attributed to com.apple.mobilemail. I tweaked the query to ‘localtime’ for the sake of presentation, but I received an email on May 31, 2020 at 11:17AM EDT regarding the June 2nd release of Cellebrite Physical Analyzer version 7.34. In the signature of that email was the US headquarters address for Cellebrite - 7 Campus Drive, Suite 210, Parsippany, NJ. 

This is just scratching the surface of a few items from one table with basic table joins made. Scanning other bundles represented here I can see some of my favorite places from Apple Maps, locations recently mentioned in iMessage conversations, and Calendar records of my standing 12:30PM Zoom meeting for Life has No Ctrl+Alt+Del. 

If you don’t have time to research but would like to hear more about it, tune in to my talk at the SANS DFIR Summit on July 16th! If you do, try out this query I used for this blog post to pry around at this one table from this database and let me know how it works on your data via Twitter @bizzybarney

select
 loc_records.id,
 	sources.bundle_id as "Bundle ID",
 	sources.group_id as "Group ID",
 	datetime(sources.seconds_from_1970, 'unixepoch') as "Source Time",
 	loc_records.cll_latitude_degrees || ", "|| loc_records.cll_longitude_degrees as "Coordinates",
 	loc_records.clp_name as "Name",
 	loc_records.clp_thoroughfare as "Road",
 	loc_records.clp_subThoroughfare as "Address #",
 	loc_records.clp_locality as "City",
 	loc_records.clp_subLocality as "Sub-locality",
 	loc_records.clp_administrativeArea as "Admin Area",
 	loc_records.clp_subAdministrativeArea as "Sub Admin Area",
 	loc_records.clp_postalCode as "Postal Code",
 	loc_records.clp_ISOcountryCode as "Country Code",
 	loc_records.clp_country as "Counrty",
 	hex (loc_records.clp_location) as "Location BLOB (hex)",
 	loc_records.extraction_os_build as "iOS Build Version",
 	loc_records.category as "Category",
 	loc_records.algorithm as "Algorithm",
 	loc_records.initial_score as "Initial Score",
 	CASE
 	WHEN loc_records.is_sync_eligible = 1 then "Yes"
	WHEN loc_records.is_sync_eligible = 0 then "No"
	end as "Sync Eligible"
from loc_records  
left join sources on loc_records.source_id=sources.id