Update: See my Jupyter Notebook for a better version.
While playing with Python and GeoPandas, I wanted to import my Google Earth placemarks into a GeoPandas GeoDataFrame. This was more complicated than expected. In theory, GeoPandas can open and parse KML (the format used by Google Earth) by simply calling:
geopandas.read_file()
It uses the Fiona library for this task. But this only imports the first folder, and my placemarks are organised in many folders. After trying several modules specialising in KML, it turned out to be easier to parse the file myself using minidom from the Python standard library; KML is XML anyway. I also created a column with the path of the folder (most of my placemarks are unnamed, so the folder names are the most useful information). This is not really fast, but it works.
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from xml.dom.minidom import parse

# Open the KML file
dom = parse('travel.kml')

# Define a function to get the path of a placemark:
# walk up the tree and prepend the name of each enclosing folder
def subfolders(node):
    if node.parentNode == dom.documentElement:
        return ""
    else:
        foldername = node.getElementsByTagName("name")[0].firstChild.data
        path = subfolders(node.parentNode) + "/" + foldername
        return path

# Parse the DOM of the KML
# For each Placemark, get a tuple of name, lat, long, foldername and path
# Append the tuple to a list of tuples
entries = []
placemarks = dom.getElementsByTagName("Placemark")
for i in placemarks:
    # The DOM yields strings, so convert the coordinates to float
    longitude = float(i.getElementsByTagName("longitude")[0].firstChild.data)
    latitude = float(i.getElementsByTagName("latitude")[0].firstChild.data)
    try:
        name = i.getElementsByTagName("name")[0].firstChild.data
    except (IndexError, AttributeError):
        # Unnamed placemark: no <name> element, or an empty one
        name = ""
    parent = i.parentNode
    foldername = parent.getElementsByTagName("name")[0].firstChild.data
    path = subfolders(parent)
    entries.append((name, latitude, longitude, foldername, path))

# Build a DataFrame from the list of tuples, then a GeoDataFrame with point geometries
df = pd.DataFrame(entries, columns=('name', 'latitude', 'longitude', 'folder', 'path'))
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.longitude, df.latitude, crs="EPSG:4326"))
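To try the folder-path idea without a real export, here is a minimal, self-contained sketch that runs on a small hand-written KML string. The sample data and the helper names own_name and folder_path are made up for illustration. Unlike the code above, it reads the position from the Point coordinates element (KML stores it as "lon,lat,alt") and only looks at direct name children, which is an assumption about how the file is structured rather than a description of my travel.kml.

```python
from xml.dom.minidom import parseString

# Hypothetical, hand-written KML with nested folders (not the real travel.kml)
KML = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document>
  <name>travel.kml</name>
  <Folder>
    <name>Europe</name>
    <Folder>
      <name>Germany</name>
      <Placemark>
        <name>Harbour</name>
        <Point><coordinates>9.98,53.54,0</coordinates></Point>
      </Placemark>
      <Placemark>
        <Point><coordinates>13.40,52.52,0</coordinates></Point>
      </Placemark>
    </Folder>
  </Folder>
</Document>
</kml>"""

dom = parseString(KML)


def own_name(node):
    """Return the text of a direct <name> child, or "" if there is none."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE and child.tagName == "name":
            return child.firstChild.data if child.firstChild else ""
    return ""


def folder_path(node):
    """Join the names of all enclosing <Folder> elements, outermost first."""
    parts = []
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.tagName == "Folder":
            parts.append(own_name(node))
        node = node.parentNode
    return "/" + "/".join(reversed(parts))


rows = []
for pm in dom.getElementsByTagName("Placemark"):
    # KML stores a point as "lon,lat[,alt]"
    coords = pm.getElementsByTagName("coordinates")[0].firstChild.data
    lon, lat = (float(v) for v in coords.strip().split(",")[:2])
    rows.append((own_name(pm), lat, lon, folder_path(pm)))

for row in rows:
    print(row)
```

Looking only at direct children means an unnamed folder or placemark does not accidentally pick up the name of something nested inside it, which getElementsByTagName would do because it searches all descendants.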
Now we can have a look at our GeoDataFrame with gdf.head(), save it to CSV with gdf.to_csv("travel.csv") and open the CSV again with gdf = gpd.read_file("travel.csv").
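Depending on the GeoPandas and GDAL versions, the geometry column may come back as plain text or be empty when a CSV is read this way. A sketch of a round trip that sidesteps this by rebuilding the points from the latitude and longitude columns could look as follows. The sample rows and the temporary file path are made up for illustration.

```python
import os
import tempfile

import geopandas as gpd
import pandas as pd

# Hypothetical sample rows standing in for the parsed placemarks
df = pd.DataFrame(
    {
        "name": ["Harbour", ""],
        "latitude": [53.54, 48.86],
        "longitude": [9.98, 2.35],
        "path": ["/Europe/Germany", "/Europe/France"],
    }
)
gdf = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df.longitude, df.latitude, crs="EPSG:4326")
)

# Write to a temporary directory so the example leaves no files behind
csv_path = os.path.join(tempfile.mkdtemp(), "travel.csv")

# Drop the geometry column: it can be rebuilt from latitude/longitude
gdf.drop(columns="geometry").to_csv(csv_path, index=False)

# keep_default_na=False keeps empty names as "" instead of NaN
restored = pd.read_csv(csv_path, keep_default_na=False)
gdf2 = gpd.GeoDataFrame(
    restored,
    geometry=gpd.points_from_xy(
        restored.longitude, restored.latitude, crs="EPSG:4326"
    ),
)
print(gdf2.head())
```

The coordinate reference system is not stored in the CSV, so it has to be set again when the points are rebuilt.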
Now we could, for example, plot the placemarks on a simple world map and draw their convex hull (the smallest convex polygon enclosing all points) around them.
# Use the natural earth dataset as basemap
# Note: gpd.datasets was deprecated in GeoPandas 0.14 and removed in 1.0;
# with newer versions, download the Natural Earth data and pass its path to gpd.read_file()
natworld = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))

# For the convex hull, I need a geometry of all placemarks combined
combined = gdf.dissolve()

# Plot
fig, ax = plt.subplots(figsize=(10,5))
natworld.plot(ax=ax, color="darkgrey", edgecolor="lightgrey")
gdf.plot(ax=ax, color="blue", marker=".")
combined.convex_hull.plot(ax=ax, color="none", edgecolor="red")
The result:
Save the figure:
fig.savefig("bounding-box.png")
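A side note on terminology: a convex hull is not quite the same thing as a bounding box. The hull is the tightest convex polygon around the points, while the bounding box is the axis-aligned rectangle from the minimum to the maximum coordinates (GeoPandas exposes the latter as the envelope attribute). The following pure-Python sketch illustrates the difference on a few made-up points. It uses Andrew's monotone chain algorithm, which is not necessarily what GeoPandas does internally, and it treats longitude and latitude as plain planar coordinates, just like the plot above.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Pop points that would make a clockwise or straight turn
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]


def bounding_box(points):
    """Axis-aligned box as (min_x, min_y, max_x, max_y)."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical (longitude, latitude) pairs; (2, 1) lies inside the hull
points = [(0, 0), (4, 0), (2, 3), (2, 1)]
hull = convex_hull(points)
bbox = bounding_box(points)
print("hull:", hull)
print("bbox:", bbox)
```

Here the hull is a triangle, while the bounding box is a rectangle with twice its area, so the two can differ noticeably for scattered placemarks.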