Apache Sedona
Data analysis software
Apache Sedona (formerly GeoSpark) is an open-source framework for processing and analyzing large-scale spatial data in a distributed computing environment.[1][2] It was created as GeoSpark in 2010 by researchers at Arizona State University[3] and entered incubation at the Apache Software Foundation in 2020. It graduated as a top-level project in February 2023.[4]
| Apache Sedona | |
|---|---|
| Other names | GeoSpark |
| Original authors | Jia Yu, Mohamed Sarwat |
| Developer | Apache Software Foundation |
| Initial release | December 10, 2017 |
| Available in | Scala, Java, SQL, Python, R |
| License | Apache 2.0 |
| Website | sedona |
| Repository | https://github.com/apache/sedona |
Overview
Sedona is a framework that facilitates distributed geospatial data processing. It integrates with Apache Spark, Apache Flink, and Snowflake,[5][6] and provides Spatial Datasets and Spatial SQL functions for loading, processing, and analyzing large-scale geospatial data across systems.[7] It supports spatial data formats including GeoJSON, Well-Known Text, and Well-Known Binary,[8][9] and offers interfaces in several programming languages: Java, Python, R, Scala, and SQL.[10][11]
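The three formats named above can be illustrated with a short sketch that encodes one 2D point in each. It uses only the Python standard library and does not call Sedona. The function names and the sample coordinate are hypothetical and chosen for illustration, and only the Point geometry type is handled.

```python
import json
import struct


def point_to_wkt(x: float, y: float) -> str:
    """Render a 2D point as Well-Known Text, e.g. 'POINT (1.0 2.0)'."""
    return f"POINT ({x!r} {y!r})"


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian Well-Known Binary (21 bytes)."""
    # Layout: 1 byte for byte order (1 = little endian),
    # a uint32 geometry type (1 = Point), then two float64 coordinates.
    return struct.pack("<BIdd", 1, 1, x, y)


def wkb_to_point(data: bytes) -> tuple[float, float]:
    """Decode a Well-Known Binary Point back into an (x, y) pair."""
    # The first byte states the byte order used for the remaining fields.
    fmt = "<Idd" if data[0] == 1 else ">Idd"
    geom_type, x, y = struct.unpack(fmt, data[1:21])
    if geom_type != 1:
        raise ValueError("this sketch only decodes Point geometries")
    return x, y


def point_to_geojson(x: float, y: float) -> str:
    """Serialize a 2D point as a GeoJSON geometry object."""
    return json.dumps({"type": "Point", "coordinates": [x, y]})


if __name__ == "__main__":
    lon, lat = -111.76, 34.87  # arbitrary sample coordinate
    print(point_to_wkt(lon, lat))
    print(point_to_wkb(lon, lat).hex())
    print(point_to_geojson(lon, lat))
```

Well-Known Text and GeoJSON are text encodings, while Well-Known Binary is a compact byte layout, which is why the sketch includes a decoder only for the binary form.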