Databricks is a cloud data lakehouse that runs on AWS, Azure, or GCP and serves as a central store for operational, transactional, and financial data at companies in retail, healthcare, financial services, and manufacturing. Connecting Databricks to Parabola lets ops, finance, and supply-chain teams query lakehouse tables directly and feed the results into automated workflows, without writing code or filing a ticket with data engineering.
Pull from Databricks
The Pull from Databricks step runs a SQL query against your Databricks SQL Warehouse and loads the result into your flow. You write the query, pick the warehouse to run it on, and Parabola handles the rest.
How to authenticate
The step uses personal access token authentication. You need two things from your Databricks workspace: your workspace URL and a personal access token.
Find your workspace URL
Your workspace URL is the address you see in the browser when you’re logged into Databricks. The format depends on your cloud provider:
- AWS: https://dbc-a1b2345c-d6e7.cloud.databricks.com
- Azure: https://adb-5555555555555555.19.azuredatabricks.net
- GCP: https://1234567890123456.7.gcp.databricks.com
Copy only the base URL in the format above, without any trailing path or ? or # characters.
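An address copied straight from the browser bar often carries a path, a query string, or a fragment. As a minimal sketch, assuming the connection form expects only the scheme and host shown in the examples above, the trimming could look like this in Python. The function name base_workspace_url is hypothetical and is not part of Parabola or Databricks.

```python
from urllib.parse import urlsplit


def base_workspace_url(url: str) -> str:
    """Return only the scheme and host of a workspace URL.

    Drops any path, query string (?...) and fragment (#...).
    Assumption: the connection form wants the bare https://host form.
    """
    parts = urlsplit(url.strip())
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"not an https workspace URL: {url!r}")
    # hostname excludes any port or credentials and is lower-cased
    return f"https://{parts.hostname}"
```

For example, passing https://dbc-a1b2345c-d6e7.cloud.databricks.com/?o=123#job/1 returns https://dbc-a1b2345c-d6e7.cloud.databricks.com.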
Generate a personal access token in Databricks
Enter a description (e.g., “Parabola integration”) and set a lifetime in days, then click Generate.
Connect in Parabola
Parabola stores your credentials securely so you don’t need to re-authenticate on every run.
Available data
The Pull from Databricks step lets you run any SQL query against your Databricks tables and pull the results into your flow. Because you write the query, you can access anything stored in the workspace your token can read:
- Any table or view in your data lakehouse: query tables across catalogs and schemas using standard SQL, whether that’s order records, customer data, inventory levels, financial transactions, or anything else your data team has made available.
- Aggregated and transformed data: write queries with filters, joins, groupings, and calculations so you pull exactly the data you need rather than full raw tables.
- Saved analyst queries: re-run a query your data team has already written for a dashboard, then route the same output through Parabola to a new destination.
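To make the aggregated-data case concrete, here is a hedged sketch of a filtered, grouped query. Python is used only to assemble the SQL text, and the text itself is what you would paste into the step. The table main.sales.orders and the columns order_date, status, and total_amount are hypothetical, so swap in names from your own catalog. The functions date_sub and current_date are Databricks SQL built-ins.

```python
import re

# Accepts table, schema.table, or catalog.schema.table with plain identifiers only.
_TABLE_NAME = re.compile(r"[A-Za-z_]\w*(\.[A-Za-z_]\w*){0,2}")


def daily_status_query(table: str, days_back: int) -> str:
    """Build SQL that summarises recent orders per day and status.

    Column names are hypothetical placeholders for this sketch.
    """
    if not _TABLE_NAME.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not isinstance(days_back, int) or days_back < 1:
        raise ValueError("days_back must be a positive integer")
    return (
        "SELECT order_date, status, COUNT(*) AS order_count, "
        "SUM(total_amount) AS revenue\n"
        f"FROM {table}\n"
        # The date filter keeps the scan to the last few days of rows.
        f"WHERE order_date >= date_sub(current_date(), {days_back})\n"
        "GROUP BY order_date, status\n"
        "ORDER BY order_date, status"
    )


if __name__ == "__main__":
    print(daily_status_query("main.sales.orders", 7))
```

The result is one row per day and status instead of one row per order, which keeps the pull small even when the underlying table is large.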
Common use cases
- Reconcile orders across systems: Pull order data from Databricks and compare it against Shopify, Amazon Seller Central, or NetSuite to flag missing orders, status mismatches, or revenue gaps before they hit customers.
- Automate recurring ops reports: Pull key metrics from Databricks on a schedule, format the result in Parabola, and route it to Slack, Google Drive, or Smartsheet without manual exports.
- Monitor fulfillment and supply-chain performance: Combine Databricks data with carrier records from FedEx, UPS, or ShipStation, or with WMS records from ShipBob or ShipHero, to flag SLA breaches and inventory shortfalls.
- Power financial reconciliation: Pull transaction data, revenue figures, or cost records from Databricks and match them against invoices, purchase orders, or GL entries from QuickBooks Online, NetSuite, or SuiteQL.
- Build cross-platform dashboards: Merge Databricks analytics with live operational data from Gorgias, Zendesk, or HubSpot so teams work from a single view.
- Push clean data to working tools: Query and transform Databricks data in Parabola, then push the result into Airtable, Smartsheet, or Google Drive for the team that actually uses it.
- Sync to other warehouses: Move clean, filtered slices of Databricks data into Snowflake, BigQuery, or Redshift for downstream BI without copying full tables.
Tips for using Parabola with Databricks
- Write focused queries with filters. Add WHERE clauses for date ranges or statuses to pull only the rows you need. Avoid SELECT * on multi-million row tables.
- Match cadence to use case. Hourly for time-sensitive ops monitoring, daily for standard reporting, weekly for performance summaries.
- Ask your data team for the first query. Once the SQL is in your flow, it runs on its own with no further data team involvement, so the up-front help pays off.
- Combine Databricks with other sources. Pull inventory from Databricks and merge it with Shopify orders, UPS tracking, or Gorgias tickets in the same flow.
- Find your SQL Warehouse ID in the Databricks sidebar by going to SQL Warehouses, clicking the warehouse name, and opening the Connection Details tab.
- Track token expiration. Personal access tokens expire at the end of the lifetime you set. Note the expiry date and rotate the token before it lapses so flows don’t fail silently.
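For the token-expiration tip, the reminder date is simple arithmetic. The sketch below assumes the lifetime is counted in whole days from the day the token was generated. The function name rotation_dates and the seven-day lead time are hypothetical choices for illustration, so check the exact expiry shown in your Databricks workspace.

```python
from datetime import date, timedelta


def rotation_dates(created: date, lifetime_days: int, lead_days: int = 7):
    """Return (rotate_by, expires_on) for a token generated on `created`.

    Assumption: expiry falls `lifetime_days` whole days after creation.
    `lead_days` is how long before expiry you want to rotate.
    """
    if lifetime_days <= 0:
        raise ValueError("lifetime_days must be positive")
    if not 0 <= lead_days < lifetime_days:
        raise ValueError("lead_days must be between 0 and lifetime_days - 1")
    expires_on = created + timedelta(days=lifetime_days)
    rotate_by = expires_on - timedelta(days=lead_days)
    return rotate_by, expires_on
```

A token generated on 1 January 2025 with a 90-day lifetime would, under these assumptions, expire on 1 April 2025 and be due for rotation by 25 March 2025.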
FAQ
Can I push data back into Databricks?
The Pull from Databricks step is read-only. To write to a Databricks table, use a Send to an API step pointed at the Databricks SQL Statement Execution API with your personal access token, or push the output into cloud storage that your Databricks job ingests.
Which clouds are supported?
All three: AWS, Azure, and GCP. The workspace URL format differs by cloud (see above), but the auth and step configuration are the same.
Why is my query timing out?
SQL Warehouses can be paused when idle and take a minute to spin back up. If your query is slow, check the warehouse’s auto-stop setting with your data team, or pick a larger warehouse for queries that scan large tables. Also confirm you’re filtering on partitioned columns where applicable.
Do I need Unity Catalog?
No. The step works with both Unity Catalog and the Hive metastore. As long as your SQL Warehouse can resolve the table reference in your query, Parabola can pull the result.
Can I use service principals instead of a personal access token?
Today the step authenticates with personal access tokens. For a service-account-style setup, generate the token under a dedicated user in your Databricks workspace and rotate it on a schedule.
With Databricks and Parabola connected, the operational reports, reconciliations, and alerts that depend on lakehouse data run on a schedule, with output landing in the systems where your team actually works.