The Databricks Export app for PostHog pushes data from PostHog to Databricks once every minute. The app creates a table and migrates the data from DBFS into a database.
Requirements
Using this app requires either PostHog Cloud or a self-hosted PostHog instance running version 1.30.0 or later.
Not running 1.30.0? Find out how to update your self-hosted PostHog deployment!
Installation
- Visit the 'Apps' page in your instance of PostHog.
- Search for 'Databricks', select the app, and press Install.
- Follow the steps below to configure the app.
Configuration
You will need the following in order to fully configure this app:
- Domain name of the cluster, provided by Databricks
- An API key, generated by following the Databricks documentation
- Your cluster ID, found in the Databricks portal
You will also need to provide a temporary file path for saving raw data, and the name of the database where you want to store the data. To ignore specific events, enter them as a comma-separated list.
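To illustrate how a comma-separated 'Events to Ignore' value behaves, the helper below is a hedged sketch: the function names and event handling are assumptions for illustration, not the app's actual source code.

```python
def parse_ignored_events(raw: str) -> set:
    """Split a comma-separated setting into a set of event names, trimming whitespace."""
    return {name.strip() for name in raw.split(",") if name.strip()}

def should_export(event_name: str, ignored: set) -> bool:
    """An event is exported only if it is not in the ignore set."""
    return event_name not in ignored

# Hypothetical setting value: ignore PostHog's autocaptured pageviews.
ignored = parse_ignored_events("$pageview, $autocapture")
print(should_export("$pageview", ignored))  # False: ignored
print(should_export("signup", ignored))     # True: exported
```

Whitespace around the commas is trimmed, so `$pageview, $autocapture` and `$pageview,$autocapture` are treated the same.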
| Option | Description |
| --- | --- |
| Domain Name (string, required) | Domain name of your Databricks cloud instance |
| API Key (string, required) | API key for your Databricks cloud instance |
| File Name (string, required) | Default filename for the CSV file |
| Python Support File Upload (string, required) | Default filename for the Python job file |
| Cluster ID (string, required) | ID of your cluster, from the Databricks portal |
| Database Name (string, required) | Name of the database in which to store the data |
| Events to Ignore (string, optional) | Comma-separated list of events to ignore |
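To make the options concrete, a filled-in configuration might look like the following. Every value here is a hypothetical placeholder, not a real domain, key, or cluster:

```
Domain Name:                dbc-12345678-abcd.cloud.databricks.com
API Key:                    dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXX
File Name:                  posthog_events.csv
Python Support File Upload: posthog_job.py
Cluster ID:                 0123-456789-abcdefgh
Database Name:              posthog
Events to Ignore:           $feature_flag_called,$pageleave
```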
Limitations
The Databricks Export app cannot currently sync historical data, nor change the frequency with which it pushes data to Databricks.
Interested in contributing to the app to remove these limitations? Check the GitHub repo!
FAQ
Is the source code for this app available?
PostHog is open-source and so are all apps on the platform. The source code for the Databricks Export app is available on GitHub.
Who created this app?
We'd like to thank community members Sandeep Guptan and Himanshu Garg for their work creating this app. Thank you, both!
Where can I find out more?
Check Databricks' API documentation for more information on pulling and pushing data from/to Databricks.
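As a rough sketch of the kind of call involved, uploading a file to DBFS goes through Databricks' `POST /api/2.0/dbfs/put` endpoint, which expects the file contents base64-encoded. The snippet below only builds the JSON request body; the path and contents are illustrative placeholders, and no request is actually sent.

```python
import base64
import json

def build_dbfs_put_payload(path: str, data: bytes, overwrite: bool = True) -> dict:
    """Build the JSON body for Databricks' /api/2.0/dbfs/put endpoint.

    DBFS expects the file contents as a base64-encoded string.
    """
    return {
        "path": path,
        "contents": base64.b64encode(data).decode("ascii"),
        "overwrite": overwrite,
    }

# Hypothetical example: a small CSV destined for a temporary DBFS path.
payload = build_dbfs_put_payload("/tmp/posthog_events.csv", b"event,timestamp\n")
print(json.dumps(payload))
```

The payload would then be POSTed to `https://<your-domain>/api/2.0/dbfs/put` with an `Authorization: Bearer <API key>` header; see the Databricks API documentation for the full details.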
Who maintains this app?
This app is maintained by the community. If you have issues with the app not functioning as intended, please let us know!
What if I have feedback on this app?
We love feature requests and feedback! Please tell us what you think.
What if my question isn't answered above?
We love answering questions. Ask us anything via our community forum, or drop us a message.