Large Scale Image Processing with Spark through Fugue
6 min read
How Clobotics Runs Distributed Image Processing
Profiling large-scale data for use cases such as anomaly detection, drift detection, and data validation.
Examining the limitations of the SQL interface for distributed computing workflows.

A deep look at the assumptions of the Pandas interface

Increase developer productivity and decrease costs for big data projects

Run PyCaret functions on each partition of data in a distributed manner

Increase developer productivity and decrease compute usage for big data projects

Connecting FugueSQL with Databricks Connect

We can do this the easy way, or the hard way

How to bring Pandas libraries to Spark and Dask with Fugue