Account Requirements

To use Userpilot Data Sync, make sure your account meets the following conditions:
  • Appropriate Userpilot Plan: Data Sync is an add-on feature. Your Userpilot subscription plan must include access to the Data Sync functionality. If you are unsure about your current plan or wish to enable Data Sync, please contact your Userpilot Account Manager or our support team for assistance.
  • Necessary User Permissions: The user configuring Data Sync within your Userpilot account must have the appropriate administrative permissions. Typically, this requires an admin role or a custom role with specific permissions to access and manage Data Sync settings. Please verify your user role and permissions within the Userpilot dashboard.

Destination Storage Setup

Userpilot Data Sync exports your event data to your chosen cloud storage destination. You will need an active account and a configured storage location with one of our supported cloud providers. Userpilot Data Sync currently supports the following destinations:
🔒 Security First: IAM Best Practices
When configuring access for Userpilot Data Sync to your cloud storage, always follow the principle of least privilege. Create a dedicated IAM user/role or service account with only the necessary write permissions (e.g., s3:PutObject for S3) on the specific bucket and path prefix. Avoid using root account credentials.
  • Amazon S3: You will need an S3 bucket, the bucket name, its region, and appropriate AWS IAM credentials (Access Key ID and Secret Access Key) that grant Userpilot write access to the specified bucket and path prefix.
  • Google Cloud Storage (GCS): You will need a GCS bucket, your Google Cloud Project ID, and service account credentials with permission to write to the bucket. You will also specify a path prefix.
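To make the least-privilege guidance above concrete, here is a minimal sketch of an S3 bucket policy document scoped to write-only access on a single prefix. The bucket name and prefix are hypothetical placeholders; substitute your own, and confirm the exact action set your plan requires with Userpilot support.

```python
import json

# Sketch of a least-privilege IAM policy for the S3 destination.
# Bucket name and path prefix below are hypothetical examples.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "UserpilotDataSyncWrite",
            "Effect": "Allow",
            # Write-only: no list/read/delete permissions are granted.
            "Action": ["s3:PutObject"],
            "Resource": ["arn:aws:s3:::my-company-userpilot-exports/userpilot/*"],
        }
    ],
}
print(json.dumps(policy, indent=2))
```

Attaching a policy like this to a dedicated IAM user keeps the blast radius small if the credentials ever leak: they cannot read or delete existing data, only write new objects under the given prefix.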
Key considerations for your storage destination:
  • Ensure the storage location (bucket/container) is created in the region that best suits your data residency and performance needs.
  • Configure access permissions meticulously. Userpilot will require write access to the specified path within your storage to deliver the data files. It is best practice to create dedicated credentials or roles with the principle of least privilege.
  • Note down all necessary details (bucket/container names, regions, access keys, path prefixes, etc.) as you will need them during the Data Sync configuration process in Userpilot.
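One lightweight way to follow the last point is to collect the required details in a single place and check nothing is missing before you start the Userpilot configuration flow. All values below are placeholders, not real credentials.

```python
# Checklist sketch: the storage details you will be asked for during
# Data Sync setup. Every value here is a placeholder.
required = {
    "bucket_name": "my-company-userpilot-exports",  # hypothetical
    "region": "us-east-1",
    "path_prefix": "userpilot/events/",
    "access_key_id": "AKIA-PLACEHOLDER",
    "secret_access_key": "PLACEHOLDER-SECRET",
}

missing = [key for key, value in required.items() if not value]
if missing:
    raise ValueError(f"Missing Data Sync settings: {missing}")
print("All storage details collected.")
```

Keep real credentials in a secrets manager rather than in source code; the dictionary above is only an inventory of what to gather.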

Technical Considerations

While Userpilot Data Sync is designed to be user-friendly, some technical understanding helps ensure a smooth experience and effective use of the synced data:
  • Understanding Data Formats: Be aware of the data format in which Userpilot will deliver the files (e.g., Avro, JSON, Parquet). Understanding the structure of these formats will be crucial for parsing and ingesting the data into your downstream systems.
  • Familiarity with Data Warehousing and ETL/ELT: While not strictly required to set up the sync, a basic understanding of data warehousing concepts and ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform) processes will be highly beneficial for your data teams who will be consuming and analyzing the synced data.
🛠️ Prepare Your Data Environment
Before using Data Sync, make sure your data team is ready. Familiarity with your data format (e.g., Avro, JSON, Parquet), cloud storage, and ETL tools will make syncing and analyzing Userpilot data much smoother.
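As a small illustration of what parsing a delivered file can look like, the sketch below reads newline-delimited JSON with the standard library. The field names are illustrative assumptions, not Userpilot's actual export schema; check the schema documented for your plan before building an ingestion pipeline.

```python
import io
import json

# Hypothetical sample: two events in JSON Lines form, one object per line.
# Field names are illustrative, not the real Userpilot export schema.
sample_file = io.StringIO(
    '{"user_id": "u-1", "event_name": "signup", "timestamp": "2024-05-01T12:00:00Z"}\n'
    '{"user_id": "u-2", "event_name": "login", "timestamp": "2024-05-01T12:05:00Z"}\n'
)

# Parse each non-empty line into a dict, ready for loading downstream.
events = [json.loads(line) for line in sample_file if line.strip()]
print(f"Parsed {len(events)} events")
```

Avro and Parquet files require a reader library (e.g., fastavro or pyarrow) rather than the standard library, which is one reason the format choice matters to your data team.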
  • Network Configuration: In most scenarios, Userpilot will connect to your cloud storage provider’s public endpoints, and no special network configurations are needed. However, if your organization has strict outbound firewall policies, ensure that Userpilot’s egress IP addresses are whitelisted to access the necessary cloud storage APIs.
  • Data Volume and Storage Costs: Consider the volume of event data your application generates. Regular data syncs can accumulate significant amounts of data over time. Be mindful of the storage costs associated with your chosen cloud provider and implement appropriate data lifecycle management policies on your storage bucket/container if necessary.
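A back-of-envelope estimate makes the storage-cost point above tangible. Every figure below is an assumption for illustration; substitute your own event volume, average event size, and your provider's current pricing.

```python
# Rough storage growth estimate; all figures are assumptions,
# not Userpilot metrics or guaranteed cloud prices.
events_per_day = 500_000      # assumed event volume
avg_event_bytes = 300         # assumed serialized size per event
days = 365

total_gb = events_per_day * avg_event_bytes * days / 1024**3
cost_per_gb_month = 0.023     # example standard-tier rate; check current pricing

monthly_cost_after_one_year = total_gb * cost_per_gb_month
print(f"~{total_gb:.1f} GB after one year, "
      f"~${monthly_cost_after_one_year:.2f}/month to store")
```

If the projected total grows uncomfortably, lifecycle rules that transition older objects to a colder storage class, or expire them after your retention window, are the usual remedy.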