Pre-download a Large Realm Dataset

cloud

(Wesley Vance) #1

Is it possible to pre-download a synced cloud Realm, so that it ships with the app’s binary from its respective app store?

I have a large read-only dataset of about 300 MB, and I’d like to have this entire dataset available on each user’s device for offline use. But at 300 MB per install, this will quickly chew up the allotted 20 GB of bandwidth provided on the initial cloud plan. Is there any way of offloading the initial download of the database to another source, either in the binary or as a download from S3 or some other source?
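
To put rough numbers on this concern, here is a back-of-the-envelope sketch. It assumes decimal units (1 GB = 1000 MB) and that every new install pulls the full 300 MB exactly once; neither assumption comes from Realm's billing rules, and the function name is only illustrative.

```python
# Assumptions (not from Realm's billing rules): decimal units and
# exactly one full 300 MB download per new install.
PLAN_BANDWIDTH_MB = 20 * 1000   # 20 GB plan allotment
DATASET_MB = 300                # size of the full-synced Realm


def full_downloads(budget_mb: int, dataset_mb: int) -> int:
    """Number of complete initial syncs that fit in the bandwidth budget."""
    return budget_mb // dataset_mb


installs = full_downloads(PLAN_BANDWIDTH_MB, DATASET_MB)
print(f"Full initial syncs before the allotment is exhausted: {installs}")
```

Under these assumptions the allotment covers only a few dozen first-time installs, before counting any later updates.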

How would one architect an application with a large (300 MB) dataset that cannot easily be broken up into smaller Realms, while still using syncing and offline support?

Thanks!
-Wes


(jay) #2

Is this a pre-populated Realm file or something else?


(Wesley Vance) #4

I have a pre-populated Realm Cloud instance with a ‘master’ list of all the data I’ll need in the application via a full sync. But if I full-sync this list of ‘markers’, it’s going to transfer a lot of data (+300 MB) and thus use a lot of bandwidth. For my particular application, I need all the data in the Realm (+300 MB) available offline, so a query Realm or some other filter or breakup of Realms isn’t an option (there’s no clear way to separate the data based on time, location, etc.). My understanding is that when I connect to a cloud Realm and sync the device, it will download the entire Realm in the background (good), but with a Realm of this size (+300 MB), it’s going to chew through my bandwidth allotment.

Is there any way to ‘pre-download’ this data from a Realm Cloud instance without using the bandwidth allotment? That is, could I pre-package some or all of the data in the Realm Cloud instance and include it in the binary of the app package, or download a .realm file from some storage such as S3, and then sync the two databases via Realm Cloud? Doing it this way would allow me to have more than about 66 users (20 GB / 300 MB) before using up my bandwidth allotment.

Thanks!


#5

My first thought is that you might bundle a populated Realm Database with the app. On first run, the app would then copy the data from the bundled Realm Database.

However, I don’t think that would solve your problem, because sync data is timestamped and so everything would still be synced, I think.


(jay) #6

@Nosl Good call. Bundle the Realm with the app - if it’s bundled, it will be read-only. Once the app starts, create your Realm (cloud sync) and copy the data you need from the bundle to the newly created Realm.

Optionally, you could leverage a solution like Firebase Storage to store the Realm file and download it from there upon app start. I would go with the first option though.
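
As a rough illustration of the first-run step, the sketch below seeds a writable data file from a read-only bundled copy only when no local file exists yet. It is a file-level stand-in written in plain Python, not Realm SDK code: `ensure_seeded` and both paths are hypothetical, and copying objects into a synced Realm would go through the SDK instead.

```python
import shutil
from pathlib import Path


def ensure_seeded(bundled_file: Path, working_file: Path) -> bool:
    """Copy the bundled seed file to a writable location on first run.

    Returns True if a copy was made, False if the working file already
    existed and was left untouched.
    """
    if working_file.exists():
        return False
    # Create the destination directory if this is the very first launch.
    working_file.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(bundled_file, working_file)
    return True
```

The same check works whether the seed file ships inside the app package or is fetched once from external storage; either way the 300 MB never passes through the sync service.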


#7

That doesn’t solve the OP’s issue with bandwidth though, @jay. Each new install will still have to sync 300 MB. I don’t think there is a way around that, so they might as well not bundle the data and sync on first run.


(jay) #8

@Nosl If it’s part of the app, it will use zero Realm bandwidth: the app would be downloaded from, for example, the app store with all of the data in it, so there would be no Realm usage at that point.

The bandwidth usage would start after the App was downloaded and once the new Realm was instantiated and data moved from the bundled Realm to the new one - it would then start sync’ing.

It doesn’t sound like all of the data would be copied over

“I need all the data in the Realm (+300 MB) available offline”

so the only Realm Cloud bandwidth would be however much data is sync’d from the new Realm.

That being said, if this is read-only data, it could just stay bundled with the app, which would result in zero Realm bandwidth usage because a read-only bundled Realm wouldn’t sync.


#9

@jay Although the pre-bundled data is read-only, I am assuming that data is replicated on Realm Cloud. Perhaps @wesley.vance can clarify?

Realm Cloud Sync will not know that the data already on the user’s device is the same as the data on Realm Cloud until it syncs the two. Then the local data will be timestamped and, as it is read-only, will not be synced again.

If the local data is identical with the data on Realm Cloud, will Realm Cloud Sync actually sync all the data before adding a timestamp, or will it recognise that it is identical and just add a timestamp?


(jay) #10

@Nosl I understand what you’re saying. However, if it’s bundled, there would be no reason to also sync it, and a bundled Realm cannot be sync’d. So, the only data that would sync would be data copied to a new Realm.


(Wesley Vance) #11

@jay & @Nosl Thanks for your help!

So your suggestion would be to bundle a Realm file with the app and then, when the app opens, open a synced Realm holding the ‘changes since the last bundled data’, pull that data into the app, and merge it locally. Once every few months, I would move the ‘updated data’ into the bundled version and clear the synced Realm of updates? This works, but it seems tedious and a fair bit of maintenance work on the app.
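
The local merge described here could look roughly like the sketch below. Plain Python dictionaries stand in for the bundled Realm and the small synced Realm of changes; the record shape, the `merged_view` name, and the idea of tracking deletions as a set of ids are all assumptions made for illustration, not part of any Realm API.

```python
from typing import Dict, Set

Record = Dict[str, str]


def merged_view(base: Dict[int, Record],
                upserts: Dict[int, Record],
                deleted_ids: Set[int]) -> Dict[int, Record]:
    """Overlay synced changes on top of the bundled base data."""
    view = dict(base)              # start from the bundled snapshot
    view.update(upserts)           # new or changed records win over the base
    for record_id in deleted_ids:
        view.pop(record_id, None)  # drop records removed since the snapshot
    return view


# Hypothetical data: three bundled records, one update, one insert, one delete.
bundled = {1: {"name": "A"}, 2: {"name": "B"}, 3: {"name": "C"}}
changes = {2: {"name": "B (updated)"}, 4: {"name": "D"}}
removed = {3}

current = merged_view(bundled, changes, removed)
print(sorted(current))  # prints [1, 2, 4]
```

Only the contents of `changes` and `removed` would travel over sync, so bandwidth scales with the size of the updates rather than with the 300 MB base.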

When using Realm in production, 20 GB of bandwidth doesn’t seem like that much data? If you have 10k app users connecting to a full-synced Realm, just 2 MB of updated data per user would use up the entire allotment? The Pro version, which expands this to 30 GB for a huge jump in cost, only gives us 3 MB per user for the same set.
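
A quick check of the per-user budget for these plan sizes, assuming decimal units (1 GB = 1000 MB) and that the allotment is split evenly across 10,000 users; the function name is illustrative only.

```python
def per_user_budget_mb(plan_gb: float, users: int) -> float:
    """Bandwidth available per user in MB, assuming 1 GB = 1000 MB."""
    return plan_gb * 1000 / users


for plan_gb in (20, 30):
    budget = per_user_budget_mb(plan_gb, 10_000)
    print(f"{plan_gb} GB plan, 10,000 users: {budget:.1f} MB per user")
```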

So, how would you ever use fully synced Realms (and query-synced Realms are not suggested for production?) where all your users have access to all the data and changes? Is my use case unique, or are others building more complex infrastructure around their Realm syncing?

Thanks a ton!


#12

If you import your 300 MB of data into Realm Database, is it still 300 MB? We’ve found Realm to be quite space-efficient. It would be nice to know how big the data is in Realm format. That will give you a better idea of how big your issue is and how complex the solution needs to be.


(Wesley Vance) #13

Yeah, it’s a 300 MB Realm file. It’s larger in other formats, but my local Realm file is hitting 300 MB.