Hey there, fellow developer! Ready to supercharge your PHP project with Databricks? You're in for a treat. The Databricks API is a powerhouse for managing your data analytics and machine learning workflows. And guess what? We're going to make it even easier with the codibly/databricks-bundle package. Let's dive in!
Before we get our hands dirty, make sure you've got PHP with Composer installed, plus your Databricks workspace URL and an access token.
Got all that? Great! Let's move on.
First things first, let's get that package installed. Fire up your terminal and run:
composer require codibly/databricks-bundle
Easy peasy, right?
Now, let's set up those API credentials. Create a .env file if you haven't already, and add:
DATABRICKS_HOST=your-workspace-url
DATABRICKS_TOKEN=your-access-token
Next, configure the bundle in your PHP project. If you're using Symfony, add this to your config/packages/databricks.yaml:
databricks:
    host: '%env(DATABRICKS_HOST)%'
    token: '%env(DATABRICKS_TOKEN)%'
For other frameworks, you'll need to load these environment variables yourself. No sweat!
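Here's one way that loading step could look outside Symfony. It's a minimal sketch, not part of the bundle: `loadDatabricksConfig()` is a hypothetical helper, and it assumes the two variables have already been exported to the process or loaded from .env by whatever dotenv library you use. Plain PHP does not read .env files on its own.

```php
<?php
// Hypothetical helper (not part of the bundle): read both settings from the
// environment and fail early if either one is missing or empty.
function loadDatabricksConfig(): array
{
    $config = [];
    foreach (['DATABRICKS_HOST', 'DATABRICKS_TOKEN'] as $name) {
        // Dotenv loaders often fill $_ENV only, so check it before getenv().
        $value = $_ENV[$name] ?? getenv($name);
        if ($value === false || trim($value) === '') {
            throw new RuntimeException("Missing environment variable: {$name}");
        }
        $config[$name] = trim($value);
    }
    return $config;
}
```

Failing early like this turns a missing token into a clear startup error instead of a confusing authentication failure later on.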
Time to get that Databricks client up and running:
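use Codibly\DatabricksBundle\DatabricksClient;

// Assumes the credentials are already loaded into the environment
$host = getenv('DATABRICKS_HOST');
$token = getenv('DATABRICKS_TOKEN');

$client = new DatabricksClient($host, $token);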
Let's make your first API call:
$clusters = $client->cluster()->list();
Boom! You've just listed all your Databricks clusters. How cool is that?
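Curious what a call like that boils down to? The Databricks REST API exposes cluster listing as a GET request to `/api/2.0/clusters/list`, authenticated with a Bearer token. The sketch below only builds that request so you can see its shape. `buildClustersListRequest()` is a hypothetical helper, the host and token are placeholders, and whether the bundle sends exactly this request is an assumption.

```php
<?php
// Hypothetical helper: describe the "list clusters" REST request.
// The endpoint path and Bearer header follow the public Databricks REST API.
function buildClustersListRequest(string $host, string $token): array
{
    // Accept hosts given with or without a scheme or a trailing slash.
    $base = rtrim($host, '/');
    if (!preg_match('#^https?://#i', $base)) {
        $base = 'https://' . $base;
    }

    return [
        'method'  => 'GET',
        'url'     => $base . '/api/2.0/clusters/list',
        'headers' => [
            'Authorization' => 'Bearer ' . $token,
            'Accept'        => 'application/json',
        ],
    ];
}

// Placeholder values, for illustration only.
$request = buildClustersListRequest('example.cloud.databricks.com/', 'dapi-example-token');
echo $request['method'], ' ', $request['url'], PHP_EOL;
// prints: GET https://example.cloud.databricks.com/api/2.0/clusters/list
```

Knowing the underlying request makes it much easier to debug with curl when a client call misbehaves.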
Now that you're rolling, let's look at some common operations:
// Create a cluster
$clusterId = $client->cluster()->create([
    'cluster_name' => 'My Awesome Cluster',
    'spark_version' => '7.3.x-scala2.12',
    'node_type_id' => 'i3.xlarge',
    'num_workers' => 2
]);

// Start a cluster
$client->cluster()->start($clusterId);

// Terminate a cluster
$client->cluster()->delete($clusterId);
// Create a job
$jobId = $client->jobs()->create([
    'name' => 'My Cool Job',
    'new_cluster' => [
        'spark_version' => '7.3.x-scala2.12',
        'node_type_id' => 'i3.xlarge',
        'num_workers' => 2
    ],
    'notebook_task' => [
        'notebook_path' => '/Users/[email protected]/My Notebook'
    ]
]);

// Run a job
$runId = $client->jobs()->runNow($jobId);
// List workspace contents
$contents = $client->workspace()->list('/Users/[email protected]');

// Import a notebook
$client->workspace()->import(
    '/Users/[email protected]/New Notebook',
    'PYTHON',
    'SOURCE',
    file_get_contents('my_notebook.py')
);
// List DBFS contents
$files = $client->dbfs()->list('/');

// Upload a file
$client->dbfs()->put('/my_file.txt', file_get_contents('local_file.txt'));
Always wrap your API calls in try-catch blocks:
try {
    $result = $client->cluster()->list();
} catch (\Exception $e) {
    // Handle the error
    echo "Oops! " . $e->getMessage();
}
And remember, Databricks has rate limits. Be nice to the API, and it'll be nice to you!
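One common way to be nice to a rate-limited API is to retry with exponential backoff. Here's a minimal sketch of that idea. `retryWithBackoff()` is a hypothetical helper, not something the bundle provides, and for brevity it retries on any exception. Real code would probably retry only on rate-limit responses (HTTP 429) or transient server errors.

```php
<?php
// Hypothetical helper: run a callable, retrying with exponential backoff.
// Retries on any Exception for brevity; narrow this down in real code.
function retryWithBackoff(callable $operation, int $maxAttempts = 4, int $baseDelayMs = 500)
{
    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (Exception $e) {
            if ($attempt >= $maxAttempts) {
                throw $e; // out of attempts: surface the last error
            }
            // The delay doubles each time: 500 ms, 1000 ms, 2000 ms, ...
            usleep($baseDelayMs * (2 ** ($attempt - 1)) * 1000);
        }
    }
}
```

You could then wrap a call like this: `retryWithBackoff(function () use ($client) { return $client->cluster()->list(); });`. The final failure is still thrown, so your usual try-catch handling keeps working.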
Want to customize your API requests? No problem:
$client->setHttpClient(new \GuzzleHttp\Client([
    'timeout' => 30,  // seconds
    'verify' => true  // keep TLS certificate verification on
]));
Don't forget to test your integration! Here's a quick example using PHPUnit:
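use Codibly\DatabricksBundle\DatabricksClient;
use PHPUnit\Framework\TestCase;

class DatabricksIntegrationTest extends TestCase
{
    public function testClusterList(): void
    {
        // Credentials come from the environment, as configured earlier
        $client = new DatabricksClient(getenv('DATABRICKS_HOST'), getenv('DATABRICKS_TOKEN'));
        $clusters = $client->cluster()->list();
        $this->assertIsArray($clusters);
    }
}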
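A test like that needs a live workspace and real credentials. For fast unit tests, one option is to hide the client behind a small interface and swap in a fake. The sketch below is plain PHP with no test framework. `ClusterLister`, `FakeClusterLister`, and `runningClusterNames()` are all hypothetical names, and the `cluster_name` and `state` keys are assumed to mirror the fields of the Databricks REST response, which may not match what the bundle returns.

```php
<?php
// Hypothetical seam: application code depends on this interface
// instead of on the Databricks client directly.
interface ClusterLister
{
    public function listClusters(): array;
}

// In-memory fake for tests: no network access or credentials needed.
final class FakeClusterLister implements ClusterLister
{
    private array $clusters;

    public function __construct(array $clusters)
    {
        $this->clusters = $clusters;
    }

    public function listClusters(): array
    {
        return $this->clusters;
    }
}

// Example of logic worth unit-testing: names of clusters that are running.
function runningClusterNames(ClusterLister $lister): array
{
    $names = [];
    foreach ($lister->listClusters() as $cluster) {
        if (($cluster['state'] ?? '') === 'RUNNING') {
            $names[] = $cluster['cluster_name'];
        }
    }
    return $names;
}
```

In production, a thin adapter implementing `ClusterLister` would delegate to the real client, while tests pass in `FakeClusterLister` with canned data.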
And there you have it! You're now a Databricks API integration ninja. Remember, this is just scratching the surface. The Databricks API has tons more to offer, so don't be afraid to explore.
For more details, check out the codibly/databricks-bundle documentation and the official Databricks API docs.
Now go forth and build something awesome! Happy coding!