Databricks provides a just-in-time data platform on top of Apache Spark that empowers anyone to easily build and deploy advanced analytics solutions. Start by connecting directly to your existing storage, with orchestrated Apache Spark clusters and Databricks-managed services in the cloud.
The Databricks integrated workspace lets you create dashboards and collaborate in easy-to-use notebooks. You can also use third-party business intelligence tools, or create your own custom Spark applications and production jobs. It all starts with Apache Spark, an open-source, fast, and general engine for large-scale data processing that includes Spark SQL, streaming, machine learning, and graph processing.
Databricks and Apache Spark are trademarks.
Databricks enhances Apache Spark with an easy yet powerful managed service for data scientists and data engineers. With just a few clicks you can start a cluster, choosing which version of Spark to run, including pre-release versions when available. You can also choose on-demand instances, spot instances, or a mixture of the two, so you can scale on demand.
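The cluster choices above (Spark version, number of workers, on-demand versus spot instances) can also be expressed programmatically. Below is a minimal sketch of the kind of payload the Databricks Clusters REST API accepts; the specific field names and values used here (`spark_version`, `node_type_id`, the `aws_attributes.availability` setting, the version label, and the instance type) are illustrative assumptions and should be checked against the current API reference for your deployment.

```python
import json

# Sketch of a cluster spec in the spirit of the Databricks Clusters API.
# Field names and values are illustrative assumptions; verify them against
# the API reference for your Databricks deployment before use.
cluster_spec = {
    "cluster_name": "analytics-dev",          # hypothetical cluster name
    "spark_version": "3.5.x-scala2.12",       # assumed Spark version label
    "node_type_id": "i3.xlarge",              # assumed instance type
    "num_workers": 4,
    "aws_attributes": {
        # A spot-with-fallback setting mixes spot and on-demand instances,
        # matching the "mixture of instances" option described above.
        "availability": "SPOT_WITH_FALLBACK",
    },
}

print(json.dumps(cluster_spec, indent=2))
```

In practice this dictionary would be sent as the JSON body of a cluster-creation request; the point here is only to show how the UI choices map onto a declarative spec.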
Data scientists, engineers, and analysts can instantly visualize results within the integrated workspace instead of exporting the data to another tool for visualization. And it's not just bar charts: a number of different visualizations are available within Databricks.
With a Spark SQL query, you can switch the result from a line chart to an area chart, a pie chart, or a map, all with a single click. With a few clicks and keystrokes you can also grant or revoke user access to your notebooks and specify which access rights other users have. Once you have secured your notebooks, your collaborators can edit them or leave comments; just highlight some text and click the comments icon to reply or create your own comment.
With Databricks you can transition from development to production in a matter of clicks using the Databricks Jobs feature. For example, instead of rewriting your notebook as a standalone application, you can run it directly as a job.
Start by selecting the notebook you would like to schedule. To run your periodic job, you can use an existing cluster that you keep continuously running, or a new cluster that is automatically created for the job and terminated when it finishes. With a few clicks you have a scheduled job, with easy access to its active and completed runs.
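The scheduling flow above can also be driven through the Jobs REST API rather than the UI. The following is a hedged sketch assuming Jobs-API-style field names (`notebook_task`, `new_cluster`, `schedule`); the job name, notebook path, version label, instance type, and cron expression are all hypothetical and should be verified against the current Databricks API reference.

```python
import json

# Sketch of a scheduled-job spec in the spirit of the Databricks Jobs API.
# All names, paths, and settings below are illustrative assumptions.
job_spec = {
    "name": "nightly-report",                      # hypothetical job name
    "notebook_task": {
        "notebook_path": "/Users/you/report",      # hypothetical notebook path
    },
    # A new cluster is created for the run and terminated when it finishes,
    # matching the second option described above.
    "new_cluster": {
        "spark_version": "3.5.x-scala2.12",        # assumed version label
        "node_type_id": "i3.xlarge",               # assumed instance type
        "num_workers": 2,
    },
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",   # every day at 02:00
        "timezone_id": "UTC",
    },
}

print(json.dumps(job_spec, indent=2))
```

To reuse a continuously running cluster instead (the first option above), you would replace the `new_cluster` block with a reference to the existing cluster's ID.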
Databricks is advanced analytics made easy: from ingest to production, with Apache Spark.