Cloud Platform Query-Based Sync Performance


#1

I think I understand the lazy loading of data for ROS locally, but I am unclear if the same applies on Cloud Platform while using query-based sync.

Let’s say we have a hypothetical nested object hierarchy like Country->State->County->City->Person. Each is a Realm object with a 1:N IList of its children. A User is associated with a Country. When the User logs in, we do a query-based sync to load the info for their Country. Does Realm then load the entire object hierarchy below that Country at that time, or is it lazily loaded?

I would be fine with Country, State, County, and even City being pre-loaded. But I do not want every Person in the Country pre-loaded. The concern is obviously memory usage, load performance, etc. Should we break the forward relationship between City and Person? Is this a non-issue because of lazy loading? Is there some entirely different way to handle the situation?

Thanks in advance for your insight!


#2

The objects and relationships are a little unclear (to me). Can you provide an example of your objects and their relationships, and then how a ‘user’ is loaded (I assume that’s the same thing as a ‘Person’)? The answer will depend a lot on how these objects ‘see’ each other.


#3

@bsohl Lazy loading and memory mapping apply to accessing the realm on disk, whether it is synced or non-synced; the techniques apply in both cases. See here for an in-depth talk -

Query-based sync (or full sync) is a method for creating that realm on the local device’s disk (at which point the realm behaves the same as if it was non-synced).

In query-based sync ALL child objects are pulled in via forward or direct links when you make a subscription on the parent. In your case I would suggest either inverting the relationship OR you can use backlinks to refer to a Person object from a City. In this way you can pull in all of the Location information and then create a separate subscription for an individual Person.

https://docs.realm.io/sync/using-synced-realms/syncing-data#query-based-synchronization
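
To illustrate the forward-link rule described above, here is a toy model in plain Swift. It is not the Realm API: `Node`, `syncedIDs`, and the sample ids are all hypothetical, and the only assumption is the one stated in this post, namely that a subscription follows forward links and never backlinks.

```swift
// Toy model, NOT the Realm API: a subscription on a root pulls in
// everything reachable through forward links.
struct Node {
    let id: String
    var forwardLinks: [String]   // ids of the objects this one links to
}

/// Returns the ids that would be synced when subscribing to `root`,
/// following forward links only (backlinks are never followed).
func syncedIDs(root: String, graph: [String: Node]) -> Set<String> {
    var seen = Set<String>()
    var pending = [root]
    while let id = pending.popLast() {
        guard let node = graph[id], seen.insert(id).inserted else { continue }
        pending.append(contentsOf: node.forwardLinks)
    }
    return seen
}

// Hypothetical data: City keeps a forward list of its people.
let withForwardLink: [String: Node] = [
    "country": Node(id: "country", forwardLinks: ["state"]),
    "state": Node(id: "state", forwardLinks: ["county"]),
    "county": Node(id: "county", forwardLinks: ["city"]),
    "city": Node(id: "city", forwardLinks: ["alice", "bob"]),
    "alice": Node(id: "alice", forwardLinks: []),
    "bob": Node(id: "bob", forwardLinks: []),
]

// Same data with the relationship inverted: each Person points at its City,
// so City can reach its people only through a backlink.
var inverted = withForwardLink
inverted["city"]?.forwardLinks = []
inverted["alice"]?.forwardLinks = ["city"]
inverted["bob"]?.forwardLinks = ["city"]

let before = syncedIDs(root: "country", graph: withForwardLink)
let after = syncedIDs(root: "country", graph: inverted)
print(before.contains("alice"), after.contains("alice"))
```

In this model, subscribing to the country with the forward link in place reaches every person, while the inverted version stops at the city. A separate subscription rooted at one person then reaches only that person and their city.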


#4

Thanks, that’s the info I needed. We will probably just cut the forward link from City->Person and keep just the backlink from City<-Person. The rest of the bi-directional links we will probably keep as-is. It is SO convenient to be able to just subscribe to our root object and then have everything down the entire object tree sync’ed auto-magically.


#5

Is it a problem to subscribe to x00k elements? Can’t Realm manage that number?

I have a similar issue, but I would like to have all my data stored offline on the device. I thought that Realm could manage that number.


#6

@au.petrone That’s fine - the sync time scales linearly with the number of objects and the complexity of your queries. So based on your requirements you can tune the query/subscription.


#7

@ianward I’m using React Native and I have two big problems:

  1. When I subscribe to a collection of approximately 1.5M objects (parents and children) with reference and linking objects (direct links and backlinks), my app crashes.

  2. When I create those objects directly on the device, the write operations become very slow (5-6s).

Plus, I use change listeners on those objects for the UI updates.


#8

Sounds like that’s too much data for your device to handle if it crashes with an out of memory or mmap error. You should pare down the data so that it can be handled by the resources of your platform.

“When I create those objects directly on the device the write operations became very slow (5-6s)”

That seems slow, but it depends on the number of objects and the complexity of the object graph. A good rule of thumb is around 10k objects per write transaction. If that is still too slow then you should continue paring down the number until you get the responsiveness you want.
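
A minimal sketch of that rule of thumb in plain Swift, assuming the objects are already in an array. `batches(of:size:)` and `writeInBatches` are hypothetical helpers, and `writeBatch` is a placeholder for whatever opens one write transaction in your app; none of this is Realm API.

```swift
/// Splits `items` into consecutive batches of at most `size` elements.
func batches<T>(of items: [T], size: Int) -> [[T]] {
    precondition(size > 0, "batch size must be positive")
    return stride(from: 0, to: items.count, by: size).map { start in
        Array(items[start..<min(start + size, items.count)])
    }
}

/// Calls `writeBatch` once per batch. `writeBatch` is a hypothetical hook:
/// in a real app it would open one write transaction and add the batch.
func writeInBatches<T>(_ items: [T],
                       batchSize: Int = 10_000,
                       writeBatch: ([T]) throws -> Void) rethrows {
    for batch in batches(of: items, size: batchSize) {
        try writeBatch(batch)
    }
}

// Usage: 25,000 placeholder values written as three separate "transactions".
var transactionSizes = [Int]()
writeInBatches(Array(0..<25_000)) { batch in
    transactionSizes.append(batch.count)
}
print(transactionSizes)
```

If writes are still too slow, lowering `batchSize` is the knob to turn, as suggested above.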


#9

I and others have had the same issue - see this GitHub ticket. I hope we get a better answer soon. My Realm has about 500k objects and I can’t query based sync all of it without Realm Core crashing. I feel like the only limit on Realm size should be available disk. Awaiting a response.


#10

@ianward

In the write transaction I write only 3-4 objects at a time. The problem is that those objects are in a large collection with a change listener on it.

@harrynetzer @ianward

Regarding opening a large database, it does not seem to be related to the device (tried on an iPhone XS). It freezes the JS thread when I open (or openAsync) the realm, and it crashes without any log.


#11

@harrynetzer The issue you are having is that the query is downloaded as one atomic transaction - the data is held in memory until the entire transaction is completed. Once the download is completed the entire transaction is memory-mapped and committed to disk - the memory-mapping could easily take 4-8x the actual state of the query in memory. You need to shrink your query and do it in pieces. See details of our sync protocol here -

https://docs.realm.io/sync/realm-sync-a-primer/realm-sync-the-details
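
As a back-of-the-envelope aid for choosing how many pieces to use, here is a rough sketch in plain Swift. It assumes the data splits evenly across pieces and that peak memory is the 4-8x multiple mentioned above (worst case 8x by default). The function name, the bytes-per-object figure, and the memory budget are all hypothetical; measure your own app before relying on numbers like these.

```swift
/// Rough number of pieces needed so that each piece's peak memory stays
/// under `budgetBytes`, assuming peak memory is `multiplier` times the
/// size of the data in that piece and the data splits evenly.
func piecesNeeded(totalBytes: Int, budgetBytes: Int, multiplier: Int = 8) -> Int {
    precondition(totalBytes >= 0 && budgetBytes > 0 && multiplier > 0)
    let peakBytes = totalBytes * multiplier
    return max(1, (peakBytes + budgetBytes - 1) / budgetBytes)  // ceiling division
}

// Hypothetical figures: 500k objects at roughly 200 bytes each.
let totalBytes = 500_000 * 200       // about 100 MB of query data
let budgetBytes = 200 * 1_000_000    // assumed 200 MB of headroom on the device
let pieces = piecesNeeded(totalBytes: totalBytes, budgetBytes: budgetBytes)
print("Sync in at least \(pieces) pieces")
```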


#12

Thanks for the suggestion. Breaking it up into pieces gave me a much nicer memory usage graph.


#13

Can you please tell me how you broke it up?


#14

I can submit a snippet later, but what I did was keep a stack of integers representing partitions of the table. I have a recursive function q which pops the next integer, makes a sync query and a token which calls q when sync completes. If the stack is empty q returns.


#15

I would love to see the snippet, thank you!


#16

Hope this helps (Swift)

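import RealmSwift

class Example {
    // Partition keys still to sync; each value is matched against Employee.rank.
    var stack = [1, 2, 3, 4]
    let realm: Realm
    // Keep subscriptions and tokens alive for as long as this object lives.
    var locationsSubscriptions = [SyncSubscription<Employee>]()
    var locationsTokens = [NotificationToken]()

    init(realm: Realm) {
        self.realm = realm
    }

    // Subscribes to the next partition, then calls itself once that partition has synced.
    func syncNext() {
        // Array has no popFirst(), so check for emptiness and use removeFirst().
        guard !stack.isEmpty else {
            // sync complete
            return
        }
        let next = stack.removeFirst()
        let locSub = realm.objects(Employee.self).filter("rank = %d", next).subscribe(named: String(next))
        locationsSubscriptions.append(locSub)
        // Capture self weakly: the token is stored on self, so a strong capture would leak.
        locationsTokens.append(locSub.observe(\.state) { [weak self] state in
            if state == .complete {
                self?.syncNext()
            }
        })
    }
}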

#17

So we need to add a column (or find a suitable existing one) and filter by that. Isn’t it possible to do something like ‘limits’ in SQL? The docs say to use splice, but I guess that you can’t subscribe to ‘spliced’ results…


#18

There is also a limit parameter you can specify in your queries to limit the amount of data pulled down:

https://docs.realm.io/sync/using-synced-realms/syncing-data#limiting-subscriptions