Importing Data

The Create Table UI provides a simple way to upload small files into Databricks to get started.


For production workloads, we recommend loading data through the Databricks File System (DBFS) rather than the Create Table UI. You can also use a wide variety of Spark Data Sources to import data directly in your notebooks.

Uploading Data

If you have a small file on your local machine that you want to analyze with Databricks, upload it to the FileStore in the Databricks File System (DBFS). To do so:

  1. Click Data to open the data panel, then click the Add Table icon at the top of the Tables panel.
  2. Upload your file by dragging it onto the dropzone, or by clicking the dropzone and choosing your files.
  3. After the upload completes, a path is displayed. You can use this path in a notebook to read the data into your cluster (see below). The path looks something like /FileStore/tables/2esy8tnj1455052720017/.
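To confirm that the upload landed at the displayed path, the directory can be listed from a notebook. The sketch below assumes it is handed dbutils.fs (predefined in Databricks notebooks) as the fs argument; the helper name list_upload is hypothetical and used only for illustration.

```python
def list_upload(fs, path):
    """Return (name, size in bytes) for each entry under `path`.

    `fs` is assumed to be `dbutils.fs`; it is passed in explicitly so the
    helper does not depend on notebook globals.
    """
    # dbutils.fs.ls returns FileInfo entries exposing .name and .size.
    return [(entry.name, entry.size) for entry in fs.ls(path)]
```

In a notebook this would be called as list_upload(dbutils.fs, "/FileStore/tables/2esy8tnj1455052720017/").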

Loading Data

You can read your raw data into Spark directly. For example, if you uploaded a CSV, you can read your data using one of these examples.


For easier access, we recommend that you create a table from your uploaded data by clicking Preview Table. See the documentation on Databases and Tables for more information.


Scala:

val sparkDF = spark.read.format("csv").load("/FileStore/tables/2esy8tnj1455052720017/")

Python:

sparkDF = spark.read.format("csv").load("/FileStore/tables/2esy8tnj1455052720017/")

R:

sparkDF <- read.df(sqlContext, source = "csv", path = "/FileStore/tables/2esy8tnj1455052720017/")
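With no options set, Spark's CSV reader loads every column as a string and names the columns _c0, _c1, and so on. The sketch below shows one way to ask for a header row and inferred column types. It assumes the uploaded file actually has a header line, and that spark is the SparkSession predefined in a notebook; the helper name read_uploaded_csv is hypothetical.

```python
def read_uploaded_csv(spark, path, has_header=True):
    """Load the CSV files under `path` into a Spark DataFrame.

    `spark` is assumed to be the SparkSession available in a notebook.
    """
    reader = (
        spark.read.format("csv")
        # Treat the first line as column names (an assumption about the file).
        .option("header", "true" if has_header else "false")
        # Ask Spark to scan the data and guess column types.
        .option("inferSchema", "true")
    )
    return reader.load(path)
```

In a notebook this would be called as read_uploaded_csv(spark, "/FileStore/tables/2esy8tnj1455052720017/"). Schema inference requires an extra pass over the data, so for large inputs an explicit schema may be preferable.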

Scala RDD:

val rdd = sc.textFile("/FileStore/tables/2esy8tnj1455052720017/")

Python RDD:

rdd = sc.textFile("/FileStore/tables/2esy8tnj1455052720017/")

If the data is small enough, you can also load this data directly onto the driver node. For example:


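Python:

import pandas as pd
# The file is tab-separated; header=0 uses the first row as column names.
pandas_df = pd.read_csv("/dbfs/FileStore/tables/2esy8tnj1455052720017/part_001-86465.tsv", sep="\t", header=0)

R:

df = read.csv("/dbfs/FileStore/tables/2esy8tnj1455052720017/part_001-86465.tsv", sep = "\t", header = TRUE)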
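The driver-local examples above address the file through a /dbfs prefix, whereas the Spark examples use the bare /FileStore/... path. The following minimal sketch converts between the two forms; it assumes DBFS is exposed to local file APIs under /dbfs on the driver, as those examples suggest, and the helper name to_local_path is hypothetical.

```python
def to_local_path(dbfs_path):
    """Map a DBFS path to the /dbfs location seen by local file APIs.

    Accepts either "/FileStore/..." or "dbfs:/FileStore/...".
    """
    # Drop an optional "dbfs:" scheme so only the absolute path remains.
    if dbfs_path.startswith("dbfs:"):
        dbfs_path = dbfs_path[len("dbfs:"):]
    if not dbfs_path.startswith("/"):
        raise ValueError("expected an absolute DBFS path, got %r" % dbfs_path)
    # Local tools such as pandas read DBFS through the /dbfs prefix.
    return "/dbfs" + dbfs_path
```

For example, to_local_path("/FileStore/tables/2esy8tnj1455052720017/") yields the directory used in the pandas and R snippets above.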

Editing Data

You cannot edit data directly within Databricks, but you can overwrite the data file using the Databricks File System (DBFS) via Databricks Utilities.
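As an illustration, a small text file could be overwritten with dbutils.fs.put, whose third argument is an overwrite flag. This is a sketch, not the only approach: it assumes fs is dbutils.fs from a notebook, that the new contents fit comfortably in a string on the driver, and the helper name overwrite_text_file is hypothetical.

```python
def overwrite_text_file(fs, path, contents):
    """Replace the file at `path` with the string `contents`.

    `fs` is assumed to be `dbutils.fs`.
    """
    # Arguments are (file, contents, overwrite); True replaces an existing file.
    return fs.put(path, contents, True)
```

In a notebook this would be called as overwrite_text_file(dbutils.fs, path, new_contents), where path points at the single file to replace.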

Deleting Data

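To delete the data, run the following DBFS command through Databricks Utilities. The second argument, true, makes the deletion recursive (the command is shown in Scala; in Python, pass True). Deleting data cannot be undone.
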

dbutils.fs.rm("dbfs:/FileStore/tables/2esy8tnj1455052720017/", true)
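Because a recursive delete cannot be undone, it can help to wrap the call in a guard. The sketch below refuses to delete anything outside /FileStore/tables/ and refuses to delete that directory itself. It assumes fs is dbutils.fs; the helper name remove_upload and the choice of /FileStore/tables/ as the only permitted root are illustrative assumptions.

```python
UPLOAD_ROOT = "/FileStore/tables/"

def remove_upload(fs, path, recurse=True):
    """Delete `path` via fs.rm, but only if it lies below UPLOAD_ROOT.

    `fs` is assumed to be `dbutils.fs`.
    """
    # Compare on the bare path, ignoring an optional "dbfs:" scheme.
    bare = path[len("dbfs:"):] if path.startswith("dbfs:") else path
    inside_root = bare.startswith(UPLOAD_ROOT)
    is_root_itself = bare.rstrip("/") == UPLOAD_ROOT.rstrip("/")
    has_parent_ref = ".." in bare.split("/")
    if not inside_root or is_root_itself or has_parent_ref:
        raise ValueError("refusing to delete %r" % path)
    # Pass the path through unchanged; fs.rm takes (path, recurse).
    return fs.rm(path, recurse)
```

In a notebook this would be called as remove_upload(dbutils.fs, "dbfs:/FileStore/tables/2esy8tnj1455052720017/").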