Object Segmentation and Detection#

The objective of this use case is to showcase prompt-based object segmentation and detection.

Prompt Object Segmentation using SAM on Sentinel-2 Images#

1- Create an ROI in QGIS (EPSG:4326) and use it to download Sentinel-2 images.

2- Create a bounding box around the object of interest.

3- Perform prompt segmentation with the SAM API.

4- Visualize the obtained results.

Object Detection using OD API on VHR Images#

1- Visualize some examples from the dataset.

2- Send POST requests to the OD API to detect objects in the images.

3- Visualize the results.

Object Detection and Segmentation using OD and SAM APIs on VHR Images#

1- Detect oriented bounding boxes (OBB) using the OD API.

2- Convert the oriented bounding boxes to axis-aligned bounding boxes (AABB).

3- Visualize the AABBs on the images.

4- Send a request to the SAM API with the AABBs as prompts.

If you are running the notebook on Pangeo JupyterLab, run the following code to install some additional Python libraries#

! pip3 install gdown opencv-contrib-python==4.7.0.72 opencv-python==4.7.0.72 opencv-python-headless==4.7.0.72
Requirement already satisfied: gdown in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (5.2.0)
Requirement already satisfied: opencv-contrib-python==4.7.0.72 in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (4.7.0.72)
Requirement already satisfied: opencv-python==4.7.0.72 in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (4.7.0.72)
Collecting opencv-python-headless==4.7.0.72
  Downloading opencv_python_headless-4.7.0.72-cp37-abi3-macosx_11_0_arm64.whl.metadata (18 kB)
Requirement already satisfied: numpy>=1.21.2 in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from opencv-contrib-python==4.7.0.72) (1.26.4)
Requirement already satisfied: beautifulsoup4 in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from gdown) (4.12.3)
Requirement already satisfied: filelock in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from gdown) (3.14.0)
Requirement already satisfied: requests[socks] in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from gdown) (2.31.0)
Requirement already satisfied: tqdm in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from gdown) (4.66.4)
Requirement already satisfied: soupsieve>1.2 in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from beautifulsoup4->gdown) (2.5)
Requirement already satisfied: charset-normalizer<4,>=2 in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from requests[socks]->gdown) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from requests[socks]->gdown) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from requests[socks]->gdown) (2.2.1)
Requirement already satisfied: certifi>=2017.4.17 in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from requests[socks]->gdown) (2024.2.2)
Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/myvenv/lib/python3.10/site-packages (from requests[socks]->gdown) (1.7.1)
Downloading opencv_python_headless-4.7.0.72-cp37-abi3-macosx_11_0_arm64.whl (32.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 32.6/32.6 MB 9.1 MB/s
Installing collected packages: opencv-python-headless
Successfully installed opencv-python-headless-4.7.0.72

[notice] A new release of pip is available: 23.3.1 -> 24.0
[notice] To update, run: pip install --upgrade pip
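
As a quick sanity check after installation (a minimal sketch, not part of the original workflow), restart the kernel and confirm the pinned builds import cleanly:

import cv2
import gdown

# Verify the pinned OpenCV build and gdown are importable.
print("OpenCV:", cv2.__version__)   # expected: 4.7.0
print("gdown:", gdown.__version__)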

Required imports#

import os
import json
import requests
import glob
import gdown
import zipfile
import numpy as np
import geopandas as gpd
import tifffile as tiff
from matplotlib import pyplot as plt
from matplotlib import patches
import folium
from utils import (
    geometry_to_coords, save_image_from_url, geometry_to_xy,
    geographic_to_pixel_bbox, get_filename_from_url,
    download_txtfile_from_url, obb_to_aabb, read_annotation_file,
)
plt.ion()
<contextlib.ExitStack at 0x1668970a0>

URLs of the APIs to be used#

sentinel_api = "http://sentinel-api-test.dev.apps.eo4eu.eu"
sam_api = "http://sam-api-test.dev.apps.eo4eu.eu"
od_api = "http://od-api-test.dev.apps.eo4eu.eu"

Headers for the API requests#

headers = {
    'accept': 'application/json',
    'access-token': 'YOUR API KEY',
    'Content-Type': 'application/json'
}
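
The access token is shown as a placeholder above. A safer pattern, sketched below, reads it from an environment variable; the variable name EO4EU_API_KEY is an assumption for illustration, not part of the platform:

import os

# Hypothetical: pull the API key from the environment rather than
# hardcoding it in the notebook.
headers = {
    'accept': 'application/json',
    'access-token': os.environ.get("EO4EU_API_KEY", "YOUR API KEY"),
    'Content-Type': 'application/json'
}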

Data directory#

data_path = "data/object_detection"
os.makedirs(data_path,exist_ok=True)

WORKFLOW 1 Prompt Image Segmentation using SAM API and Sentinel-2 Images#

Create a shapefile in QGIS to define the ROI in EPSG 4326, then read it and convert the geometry to xy coordinates to prepare for the API call#

roi_gdf = gpd.read_file("data/object_detection/sentinel_roi/palm_roi.shp")
geometry = roi_gdf.geometry.iloc[0]
coords = geometry_to_coords(geometry)
coords
[[54.939335770738644, 24.963109041667348],
 [54.939335770738644, 25.06510904166735],
 [55.04333577073864, 25.06510904166735],
 [55.04333577073864, 24.963109041667348],
 [54.939335770738644, 24.963109041667348]]
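
geometry_to_coords comes from the local utils module. A minimal sketch of the behaviour implied by the output above, assuming a shapely Polygon input (the actual helper may differ):

from shapely.geometry import Polygon

def geometry_to_coords(geometry: Polygon) -> list:
    # Exterior ring as [[lon, lat], ...]; shapely closes the ring,
    # so the first point is repeated at the end, as in the output above.
    return [list(pt) for pt in geometry.exterior.coords]
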
coords_lat_lon = [[lat, lon] for lon, lat in coords]
center_lat = sum(lat for lat, lon in coords_lat_lon) / len(coords_lat_lon)
center_lon = sum(lon for lat, lon in coords_lat_lon) / len(coords_lat_lon)
m = folium.Map(location=[center_lat, center_lon], zoom_start=13)
folium.TileLayer('openstreetmap').add_to(m)
folium.TileLayer(
        tiles = 'https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}',
        attr = 'Esri',
        name = 'Esri Satellite',
        overlay = False,
        control = True
       ).add_to(m)
folium.Polygon(
    locations=coords_lat_lon,
    color='blue',
    fill=True,
    fill_color='blue',
    fill_opacity=0.2
).add_to(m)
folium.LayerControl().add_to(m)
m

Call the Sentinel API to get RGB images of the ROI. The required parameters are:#

  • Geometry

  • Start date

  • End date

  • Cloud cover

  • Index

data = {
    "geometry":coords,
    "start_date":"2023-06-05",
    "end_date":"2023-06-08",
    "cloud_cover":"10",
    "index":"RGB"  
}
response = requests.post(os.path.join(sentinel_api,"api/v1/s2l2a/roi/process"), headers=headers, json=data)
print(response.text)
task_id = json.loads(response.content.decode())
{"task_id":"89839370-4215-499f-9d94-18f31b27b7a9"}

Monitor task status#

params = {'task_id': task_id["task_id"]}
response = requests.get(os.path.join(sentinel_api,"api/v1/task/status"), headers=headers, params=params)
print(response.text)
{"task_id":"89839370-4215-499f-9d94-18f31b27b7a9","state":"SUCCESS","result":"[{'image': 'L2/tiles/39/R/ZH/S2B_MSIL2A_20230607T064629_N0509_R020_T39RZH_20230607T093051.SAFE', 'processed': True, 'url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T39RZH_20230607T064629_RGB.tif?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=Hcqm1Dh7Hdg%2FSrhgE1WDVLUko1M%3D&Expires=1719431249', 'uri': 's3://MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T39RZH_20230607T064629_RGB.tif'}, {'image': 'L2/tiles/40/R/CN/S2B_MSIL2A_20230607T064629_N0509_R020_T40RCN_20230607T093051.SAFE', 'processed': True, 'url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T40RCN_20230607T064629_RGB.tif?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=XlCPO%2FsONqMLzT7NM6BS5ypRFus%3D&Expires=1719431254', 'uri': 's3://MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T40RCN_20230607T064629_RGB.tif'}, {'image': 'L2/tiles/40/R/BN/S2B_MSIL2A_20230607T064629_N0509_R020_T40RBN_20230607T093051.SAFE', 'processed': True, 'url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T40RBN_20230607T064629_RGB.tif?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=tSNd0dfpHUrMdzoFIgFirnO4DeI%3D&Expires=1719431259', 'uri': 's3://MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T40RBN_20230607T064629_RGB.tif'}]"}

Get results using task ID#

params = {'task_id': task_id["task_id"]}
response = requests.get(os.path.join(sentinel_api,"api/v1/s2l2a/roi/process"), headers=headers, params=params)
results = response.json()
print (results)
{'results': [{'image': 'L2/tiles/39/R/ZH/S2B_MSIL2A_20230607T064629_N0509_R020_T39RZH_20230607T093051.SAFE', 'processed': True, 'url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T39RZH_20230607T064629_RGB.tif?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=Hcqm1Dh7Hdg%2FSrhgE1WDVLUko1M%3D&Expires=1719431249', 'uri': 's3://MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T39RZH_20230607T064629_RGB.tif'}, {'image': 'L2/tiles/40/R/CN/S2B_MSIL2A_20230607T064629_N0509_R020_T40RCN_20230607T093051.SAFE', 'processed': True, 'url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T40RCN_20230607T064629_RGB.tif?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=XlCPO%2FsONqMLzT7NM6BS5ypRFus%3D&Expires=1719431254', 'uri': 's3://MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T40RCN_20230607T064629_RGB.tif'}, {'image': 'L2/tiles/40/R/BN/S2B_MSIL2A_20230607T064629_N0509_R020_T40RBN_20230607T093051.SAFE', 'processed': True, 'url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T40RBN_20230607T064629_RGB.tif?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=tSNd0dfpHUrMdzoFIgFirnO4DeI%3D&Expires=1719431259', 'uri': 's3://MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T40RBN_20230607T064629_RGB.tif'}]}

Save results from signed URL to local files#

rgb_output_path = os.path.join(data_path,"sentinel_rgb_images")
os.makedirs(rgb_output_path,exist_ok=True)
list_downloaded_imgs = []
for res in results["results"]:
    file_name = get_filename_from_url(res["url"])
    output_file_path = os.path.join(rgb_output_path,file_name)
    save_image_from_url(res["url"],output_file_path)
    list_downloaded_imgs.append(output_file_path)
Image successfully saved to data/object_detection/sentinel_rgb_images/T39RZH_20230607T064629_RGB.tif
Image successfully saved to data/object_detection/sentinel_rgb_images/T40RCN_20230607T064629_RGB.tif
Image successfully saved to data/object_detection/sentinel_rgb_images/T40RBN_20230607T064629_RGB.tif
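
save_image_from_url and get_filename_from_url are also utils helpers. A minimal sketch of equivalent behaviour (assumed, not the actual implementation): strip the query string from the signed URL to recover the file name, then stream the download to disk:

import os
import requests
from urllib.parse import urlparse

def get_filename_from_url(url: str) -> str:
    # Drop the query string (the AWS signature) and keep the last path part.
    return os.path.basename(urlparse(url).path)

def save_image_from_url(url: str, output_path: str) -> None:
    # Stream the signed URL to disk to avoid holding the image in memory.
    with requests.get(url, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        with open(output_path, "wb") as f:
            for chunk in resp.iter_content(chunk_size=1 << 20):
                f.write(chunk)
    print(f"Image successfully saved to {output_path}")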

Visualize results#

num_images = len(list_downloaded_imgs)
cols = 3  
rows = (num_images // cols) + (num_images % cols > 0) 
plt.figure(figsize=(15, 5 * rows)) 
for i, image_path in enumerate(list_downloaded_imgs):
    if not os.path.exists(image_path):
        print(f"Image not found: {image_path}")
        continue
    img = tiff.imread(image_path)
    plt.subplot(rows, cols, i + 1)
    plt.imshow(img.astype("uint8"))
    plt.axis('off') 
    plt.title(os.path.basename(image_path))
plt.tight_layout()
plt.show()
_images/80bc6c405392625f932b056e0206a26f3dea843b230c05b1501e4d27d0ba7d99.png

Read the bounding box and convert it to x_min, y_min, x_max, y_max#

bbox_gdf = gpd.read_file("data/object_detection/sentinel_roi/bbox.shp")
list_bboxes = geometry_to_xy(bbox_gdf)
list_bboxes
[[54.94884426946525,
  24.969402298985127,
  55.034844269465246,
  25.04040229898513]]
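
Judging from the output, geometry_to_xy (from utils) returns one [x_min, y_min, x_max, y_max] list per geometry. A plausible sketch using the GeoDataFrame bounds:

import geopandas as gpd

def geometry_to_xy(gdf: gpd.GeoDataFrame) -> list:
    # One [x_min, y_min, x_max, y_max] per row, in the GeoDataFrame's CRS.
    return gdf.geometry.bounds[["minx", "miny", "maxx", "maxy"]].values.tolist()
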
min_lon, min_lat, max_lon, max_lat = list_bboxes[0]
coords_lat_lon = [[lat, lon] for lon, lat in coords]
coords_lat_lon_bbox = [
    [min_lat, min_lon],
    [min_lat, max_lon],
    [max_lat, max_lon],
    [max_lat, min_lon],
    [min_lat, min_lon]  
]
center_lat = sum(lat for lat, lon in coords_lat_lon) / len(coords_lat_lon)
center_lon = sum(lon for lat, lon in coords_lat_lon) / len(coords_lat_lon)
m = folium.Map(location=[center_lat, center_lon], zoom_start=13)
folium.TileLayer('openstreetmap').add_to(m)
folium.TileLayer(
        tiles = 'https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}',
        attr = 'Esri',
        name = 'Esri Satellite',
        overlay = False,
        control = True
       ).add_to(m)
folium.Polygon(
    locations=coords_lat_lon,
    color='blue',
    fill=True,
    fill_color='blue',
    fill_opacity=0.2
).add_to(m)
folium.Polygon(
    locations=coords_lat_lon_bbox,
    color='red',
    fill=True,
    fill_color='red',
    fill_opacity=0.2
).add_to(m)
folium.LayerControl().add_to(m)
m

Convert geographic coordinates to pixel coordinates#

img = tiff.imread(list_downloaded_imgs[2])
roi_bbox = geometry_to_xy(roi_gdf)[0]
pixel_bboxes = geographic_to_pixel_bbox(
    np.array(list_bboxes), img.shape[1], img.shape[0],
    roi_bbox[1], roi_bbox[3], roi_bbox[0], roi_bbox[2],
)
pixel_bboxes
array([[  97,  277,  980, 1076]])
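
The conversion is a linear rescaling of longitude/latitude into pixel coordinates, with latitude flipped because image rows grow downwards from the northern edge. A sketch consistent with the call above, whose argument order (bboxes, width, height, lat_min, lat_max, lon_min, lon_max) is inferred from the roi_bbox indexing:

import numpy as np

def geographic_to_pixel_bbox(bboxes, img_width, img_height,
                             lat_min, lat_max, lon_min, lon_max):
    # bboxes: array of [lon_min, lat_min, lon_max, lat_max] rows.
    lon0, lat0, lon1, lat1 = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]
    x_min = (lon0 - lon_min) / (lon_max - lon_min) * img_width
    x_max = (lon1 - lon_min) / (lon_max - lon_min) * img_width
    # Row 0 is the northern edge, so the y axis is flipped.
    y_min = (lat_max - lat1) / (lat_max - lat_min) * img_height
    y_max = (lat_max - lat0) / (lat_max - lat_min) * img_height
    return np.stack([x_min, y_min, x_max, y_max], axis=1).astype(int)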

Visualize the image and the pixel bounding box for verification#

fig, ax = plt.subplots(1)

# Display the image
ax.imshow(img.astype("uint8"))

# Add the bounding boxes
for box in pixel_bboxes:
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    rect = patches.Rectangle((x_min, y_min), width, height, linewidth=2, edgecolor='r', facecolor='none')
    ax.add_patch(rect)

# Display the plot
plt.show()
_images/44e306c9340d3834648201f6717324ab98d2b653edbdefcc30114279361a9675.png

Segment the image with the SAM API using the bounding box as a prompt#

rgb_img_uri = results["results"][2]["uri"]
data = {
    "list_images": [
        {
            "image_name": os.path.basename(rgb_img_uri),
            "image_uri": rgb_img_uri,
            "bboxes": pixel_bboxes.tolist(),
            "points": None,
            "labels": None
        },
    ]
}
response = requests.post(os.path.join(sam_api,"api/v1/prompt"),headers=headers,json=data)
print (response.text)
task_id = json.loads(response.content.decode())
{"task_id":"b55d89fe-c885-45a2-b684-c66a73a0483e"}

Monitor task status#

params = {'task_id': task_id["task_id"]}
response = requests.get(os.path.join(sam_api,"api/v1/task/status"), headers=headers, params=params)
print(response.text)
{"task_id":"b55d89fe-c885-45a2-b684-c66a73a0483e","state":"SUCCESS","result":"[{'image_uri': 's3://MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T40RBN_20230607T064629_RGB.tif', 'processed': True, 'png_result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/b55d89fe-c885-45a2-b684-c66a73a0483e/T40RBN_20230607T064629_RGB_bbox_mask.png?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=1Cc4D5GRRGYXj0rctiMz96mGWA8%3D&Expires=1719432094', 'png_result_uri': 's3://MoBucket/b55d89fe-c885-45a2-b684-c66a73a0483e/T40RBN_20230607T064629_RGB_bbox_mask.png', 'tif_result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/b55d89fe-c885-45a2-b684-c66a73a0483e/T40RBN_20230607T064629_RGB_bbox_mask.tif?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=Ele%2BzVmJEW7dRaldXWVkEd5TnZs%3D&Expires=1719432094', 'tif_result_uri': 's3://MoBucket/b55d89fe-c885-45a2-b684-c66a73a0483e/T40RBN_20230607T064629_RGB_bbox_mask.tif'}]"}

Get the results#

params = {'task_id': task_id["task_id"]}
response = requests.get(os.path.join(sam_api,"api/v1/prompt"), headers=headers, params=params)
results = response.json()
print (results)
{'results': [{'image_uri': 's3://MoBucket/89839370-4215-499f-9d94-18f31b27b7a9/T40RBN_20230607T064629_RGB.tif', 'processed': True, 'png_result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/b55d89fe-c885-45a2-b684-c66a73a0483e/T40RBN_20230607T064629_RGB_bbox_mask.png?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=1Cc4D5GRRGYXj0rctiMz96mGWA8%3D&Expires=1719432094', 'png_result_uri': 's3://MoBucket/b55d89fe-c885-45a2-b684-c66a73a0483e/T40RBN_20230607T064629_RGB_bbox_mask.png', 'tif_result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/b55d89fe-c885-45a2-b684-c66a73a0483e/T40RBN_20230607T064629_RGB_bbox_mask.tif?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=Ele%2BzVmJEW7dRaldXWVkEd5TnZs%3D&Expires=1719432094', 'tif_result_uri': 's3://MoBucket/b55d89fe-c885-45a2-b684-c66a73a0483e/T40RBN_20230607T064629_RGB_bbox_mask.tif'}]}

Save results to local file#

segmentation_output_path = os.path.join(data_path,"sam_segmentation")
os.makedirs(segmentation_output_path,exist_ok=True)
sam_output_tif_path = os.path.join(segmentation_output_path,"prompt_mask_bbox.tif")
save_image_from_url(results["results"][0]["tif_result_url"],sam_output_tif_path)
Image successfully saved to data/object_detection/sam_segmentation/prompt_mask_bbox.tif

Visualize the results#

rgb_img = tiff.imread(list_downloaded_imgs[2])
msk_img = tiff.imread(sam_output_tif_path)
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.imshow(rgb_img.astype("uint8"))
plt.axis('off')
plt.title('Original Image')
msk_img = msk_img.astype("float32")  # cast to float so background pixels can be set to NaN
msk_img[msk_img==0] = np.nan
plt.subplot(1, 2, 2)
plt.imshow(rgb_img.astype("uint8"))
plt.imshow(msk_img, cmap='jet', alpha=0.5)
plt.axis('off')
plt.title('Mask Overlay')
plt.tight_layout()
plt.show()
_images/d56ad54231d81ea1655b03cd3bd48f3c69125a2964d57e687c643f86f3460a06.png

WORKFLOW 2 Object Detection using OD API and VHR Images#

Download the dataset from Google Drive#

file_id = "1jfc3mPcGN3ufs3aw9LthG9SWvNQorCoK"
zip_destination = "data/object_detection/tiles.zip"
os.makedirs(os.path.dirname(zip_destination), exist_ok=True)
url = f"https://drive.google.com/uc?id={file_id}"
gdown.download(url, zip_destination, quiet=False)
with zipfile.ZipFile(zip_destination, 'r') as zip_ref:
    zip_ref.extractall(data_path)
os.remove(zip_destination)
Downloading...
From (original): https://drive.google.com/uc?id=1jfc3mPcGN3ufs3aw9LthG9SWvNQorCoK
From (redirected): https://drive.google.com/uc?id=1jfc3mPcGN3ufs3aw9LthG9SWvNQorCoK&confirm=t&uuid=912502da-bed5-4ed0-9b53-a5fff6265a5a
To: /Users/syam/Documents/code/eo4eu/igarss2024-eo4eu/docs/data/object_detection/tiles.zip
100%|██████████| 1.37G/1.37G [00:13<00:00, 97.7MB/s]

Get the list of images from the local directory. These images are already mirrored on S3 storage so the model can access them without re-uploading#

list_imgs = glob.glob(os.path.join(data_path, 'tiles/*.tif'))
len(list_imgs)
625

Visualize some examples of these images#

num_images = 9
random_indices = np.random.choice(len(list_imgs), num_images, replace=False)
random_images = [tiff.imread(list_imgs[i]) for i in random_indices]
figsize = (10, 10)
fig, axes = plt.subplots(3, 3, figsize=figsize)
fig.subplots_adjust(hspace=0.3, wspace=0.3)
axes = axes.ravel()
for i in range(num_images):
    axes[i].imshow(random_images[i], cmap='gray')
    axes[i].axis('off')
plt.tight_layout()
plt.show()
_images/a86a7e96e0af8deec3ebae0615ccf78b9eb048829c9ca403dc113ee272d57dba.png

Send a request to the OD API to detect all object classes supported by the model#

classes = {
        "plane": 0,
        "ship": 1,
        "storage-tank": 2,
        "baseball-diamond": 3,
        "tennis-court": 4,
        "basketball-court": 5,
        "ground-track-field": 6,
        "harbor": 7,
        "bridge": 8,
        "large-vehicle": 9,
        "small-vehicle": 10,
        "helicopter": 11,
        "roundabout": 12,
        "soccer-ball-field": 13,
        "swimming-pool": 14,
        "container-crane": 15,
        "airport": 16,
        "helipad": 17
      }
list_tasks = []
img_files = [list_imgs[i] for i in random_indices]
img_files = [os.path.basename(img) for img in img_files]
data = {"list_images":[{"image_uri":os.path.join("s3://MoBucket/obj-det",img),"classes":classes} for img in img_files]}
response = requests.post(os.path.join(od_api,"api/v1/yolov8/obb/detect"),headers=headers,json=data)
task_id = json.loads(response.content.decode())
list_tasks.append(task_id["task_id"])

Monitor the status of the tasks#

for idx, task_id in enumerate(list_tasks):
    print (idx)
    params = {'task_id': task_id}
    response = requests.get(os.path.join(od_api,"api/v1/task/status"), headers=headers, params=params)
    print(response.text)
0
{"task_id":"1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d","state":"SUCCESS","result":"[{'image_uri': 's3://MoBucket/obj-det/patch_209.tif', 'processed': True, 'result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_209.txt?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=zUcRDWkk4ocTq6zVB0RyzUxirO4%3D&Expires=1719432291', 'result_uri': 's3://MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_209.txt'}, {'image_uri': 's3://MoBucket/obj-det/patch_465.tif', 'processed': True, 'result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_465.txt?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=FL1mlKIWWMRfj6CAZbq3H%2BjiLh4%3D&Expires=1719432291', 'result_uri': 's3://MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_465.txt'}, {'image_uri': 's3://MoBucket/obj-det/patch_104.tif', 'processed': True, 'result_url': None, 'result_uri': None}, {'image_uri': 's3://MoBucket/obj-det/patch_143.tif', 'processed': True, 'result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_143.txt?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=iyaMXcyRRs2BfLgV0WuEN%2FSWX%2FE%3D&Expires=1719432292', 'result_uri': 's3://MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_143.txt'}, {'image_uri': 's3://MoBucket/obj-det/patch_176.tif', 'processed': True, 'result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_176.txt?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=P1NrEa4JJcVcgGVfb2%2F5vad0P9Q%3D&Expires=1719432293', 'result_uri': 's3://MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_176.txt'}, {'image_uri': 's3://MoBucket/obj-det/patch_598.tif', 'processed': True, 'result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_598.txt?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=z0syym9WH8hUATovh29HaFcRR9c%3D&Expires=1719432294', 'result_uri': 's3://MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_598.txt'}, {'image_uri': 's3://MoBucket/obj-det/patch_338.tif', 'processed': True, 'result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_338.txt?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=fBo%2Fb1BM%2Fl1yVePbsnRz938b0OY%3D&Expires=1719432294', 'result_uri': 's3://MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_338.txt'}, {'image_uri': 's3://MoBucket/obj-det/patch_487.tif', 'processed': True, 'result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_487.txt?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=SnxMzT1L4046Mft6%2BkFktSgQ56Y%3D&Expires=1719432295', 'result_uri': 's3://MoBucket/1e8ea75c-a0f1-4865-8cbf-b2ccb07a8a3d/patch_487.txt'}, {'image_uri': 's3://MoBucket/obj-det/patch_33.tif', 'processed': True, 'result_url': None, 'result_uri': None}]"}

Save results to local files#

bbox_output_path = "data/object_detection/vhr_bboxes"
os.makedirs(bbox_output_path,exist_ok=True)
for task in list_tasks:
    params = {'task_id': task}
    response = requests.get(os.path.join(od_api,"api/v1/yolov8/obb/detect"), headers=headers, params=params)
    results = response.json()
    results = results["results"]
    for res in results:
        result_url = res["result_url"]
        if result_url is not None:
            fname = get_filename_from_url(result_url)
            output_fname = os.path.join(bbox_output_path,fname)
            download_txtfile_from_url(result_url,output_fname)
File downloaded successfully as data/object_detection/vhr_bboxes/patch_209.txt
File downloaded successfully as data/object_detection/vhr_bboxes/patch_465.txt
File downloaded successfully as data/object_detection/vhr_bboxes/patch_143.txt
File downloaded successfully as data/object_detection/vhr_bboxes/patch_176.txt
File downloaded successfully as data/object_detection/vhr_bboxes/patch_598.txt
File downloaded successfully as data/object_detection/vhr_bboxes/patch_338.txt
File downloaded successfully as data/object_detection/vhr_bboxes/patch_487.txt
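
Each downloaded result is a plain-text file with one detection per line, in the YOLOv8 OBB style: a class id followed by the four corner points. read_annotation_file (used below) is assumed to parse it roughly like this sketch; the exact layout, e.g. a trailing confidence score, is not confirmed by the source:

def read_annotation_file(path: str) -> list:
    # One detection per line: class_id x1 y1 x2 y2 x3 y3 x4 y4
    annotations = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue
            annotations.append({
                "class_id": int(float(parts[0])),
                "coordinates": [float(v) for v in parts[1:9]],
            })
    return annotations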

Visualize the results using Plotly Dash if you are running this notebook on your own machine#

import colorsys
import io
import dash
from dash import dcc, html
import dash_daq as daq
from dash.dependencies import Input, Output
import os
import cv2
import base64
from PIL import Image
import numpy as np

CATEGORIES = {
    "plane": 0,
    "ship": 1,
    "storage-tank": 2,
    "baseball-diamond": 3,
    "tennis-court": 4,
    "basketball-court": 5,
    "ground-track-field": 6,
    "harbor": 7,
    "bridge": 8,
    "large-vehicle": 9,
    "small-vehicle": 10,
    "helicopter": 11,
    "roundabout": 12,
    "soccer-ball-field": 13,
    "swimming-pool": 14,
    "container-crane": 15,
    "airport": 16,
    "helipad": 17,
}

app = dash.Dash(__name__)

image_directory = "data/object_detection/tiles"

annotation_directory = "data/object_detection/vhr_bboxes"

def generate_class_colors(num_classes):
    hsv_colors = [(x * 1.0 / num_classes, 1.0, 1.0) for x in range(num_classes)]
    rgb_colors = [
        tuple(int(255 * y) for y in colorsys.hsv_to_rgb(*color)) for color in hsv_colors
    ]
    return rgb_colors

def get_image_options():
    images = []
    annotations_paths = []
    for filename in os.listdir(image_directory):
        if filename.endswith((".png", ".jpg", ".jpeg",".tif")):
            annotation_filename = os.path.splitext(filename)[0] + ".txt"
            annotation_path = os.path.join(annotation_directory, annotation_filename)
            if os.path.exists(annotation_path):
                images.append(filename)
                annotations_paths.append(annotation_path)
    return [
        {"label": image, "value": annotation_path}
        for image, annotation_path in zip(images, annotations_paths)
    ]


num_classes = len(CATEGORIES)

class_colors = generate_class_colors(num_classes)

app.layout = html.Div(
    [
        html.H1("Image Annotation Viewer", style={"text-align": "center"}),
        dcc.Dropdown(
            id="image-dropdown",
            options=get_image_options(),
            multi=False,
            value=get_image_options()[0]["value"],
            style={"width": "50%", "margin": "auto", "margin-top": "20px"},
        ),
        html.H3("Toggle annotation"),
        html.Div(
            daq.ToggleSwitch(id="annotation-toggle", value=True),
            style={"width": "50%", "margin": "auto", "margin-top": "20px"},
        ),
        html.H3("Toggle label"),
        html.Div(
            daq.ToggleSwitch(id="label-toggle", value=True),
            style={"width": "50%", "margin": "auto", "margin-top": "20px"},
        ),
        html.Div(
            [
                html.Img(
                    id="image-display", style={"width": "100%", "margin-top": "20px"}
                )
            ],
            style={"text-align": "center"},
        ),
    ],
    style={"font-family": "Arial", "background-color": "#f5f5f5", "padding": "20px"},
)

@app.callback(
    Output("image-display", "src"),
    Input("image-dropdown", "value"),
    Input("annotation-toggle", "value"),
    Input("label-toggle", "value"),
)
def update_image(selected_annotation_path, show_annotation, show_label):
    if selected_annotation_path is None:
        return
    image_filename = (
        os.path.basename(selected_annotation_path).split(".txt")[0] + ".tif"
    )
    image_path = os.path.join(image_directory, image_filename)

    img = Image.open(image_path)
    img_np = np.array(img)
    if show_annotation:
        try:
            annotations = read_annotation_file(selected_annotation_path)
            for annotation in annotations:
                class_id = annotation["class_id"]
                class_name = [
                    name for name, id in CATEGORIES.items() if id == class_id
                ][0]
                coordinates = np.array(annotation["coordinates"]).reshape(-1, 2)
                coordinates = coordinates.astype(int)

                color = class_colors[class_id]

                img_np = cv2.polylines(
                    img_np, [coordinates], isClosed=True, color=color, thickness=2
                )

                if show_label:
                    label_position = (
                        int(coordinates[0, 0]),
                        int(coordinates[0, 1] - 5),
                    )
                    cv2.putText(
                        img_np,
                        class_name,
                        label_position,
                        cv2.FONT_HERSHEY_SIMPLEX,
                        0.5,
                        color,
                        2,
                    )

        except OSError:
            # Skip images whose annotation file cannot be read.
            pass

    img_pil = Image.fromarray(img_np)
    img_byte_array = io.BytesIO()
    img_pil.save(img_byte_array, format="PNG")
    img_base64 = base64.b64encode(img_byte_array.getvalue()).decode("utf-8")
    img_src = f"data:image/png;base64,{img_base64}"
    return img_src


if __name__ == "__main__":
    app.run()
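
Running this cell as a script starts a local Dash server, by default at http://127.0.0.1:8050; open it in a browser to pick images from the dropdown and toggle the box and label overlays.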

If you are running the notebook on Pangeo JupyterHub, use matplotlib for plotting to show the images#

import colorsys
import os
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

CATEGORIES = {
    "plane": 0,
    "ship": 1,
    "storage-tank": 2,
    "baseball-diamond": 3,
    "tennis-court": 4,
    "basketball-court": 5,
    "ground-track-field": 6,
    "harbor": 7,
    "bridge": 8,
    "large-vehicle": 9,
    "small-vehicle": 10,
    "helicopter": 11,
    "roundabout": 12,
    "soccer-ball-field": 13,
    "swimming-pool": 14,
    "container-crane": 15,
    "airport": 16,
    "helipad": 17,
}

image_directory = "data/object_detection/tiles"
annotation_directory = "data/object_detection/vhr_bboxes"

def generate_class_colors(num_classes):
    hsv_colors = [(x * 1.0 / num_classes, 1.0, 1.0) for x in range(num_classes)]
    rgb_colors = [
        tuple(int(255 * y) for y in colorsys.hsv_to_rgb(*color)) for color in hsv_colors
    ]
    return rgb_colors


def draw_annotations(image_path, annotation_path, show_annotation=True, show_label=True):
    img = Image.open(image_path)
    img_np = np.array(img)
    fig, ax = plt.subplots(1)
    ax.imshow(img_np)

    if show_annotation:
        annotations = read_annotation_file(annotation_path)
        class_colors = generate_class_colors(len(CATEGORIES))

        for annotation in annotations:
            class_id = annotation['class_id']
            class_name = [name for name, id in CATEGORIES.items() if id == class_id][0]
            
            coordinates = np.array(annotation['coordinates']).reshape(-1, 2)
            color = class_colors[class_id]
            color = tuple(c/255.0 for c in color)
            polygon = plt.Polygon(coordinates, edgecolor=color, fill=None, linewidth=2)
            ax.add_patch(polygon)

            if show_label:
                label_position = coordinates[0]
                ax.text(label_position[0], label_position[1], class_name, color=color, fontsize=12, weight='bold')

    plt.axis('off')
    plt.show()


# Main function to process all images
def process_images():
    for filename in os.listdir(annotation_directory):
        if filename.endswith(".txt"):
            annotation_path = os.path.join(annotation_directory, filename)
            if os.path.exists(annotation_path):
                image_path = os.path.join(image_directory, os.path.basename(filename).split(".")[0]+".tif")
                draw_annotations(image_path, annotation_path,show_label=False)


if __name__ == '__main__':
    process_images()
_images/… (29 output figures: each annotated image with its detected oriented bounding boxes overlaid)

WORKFLOW 3 Object Detection and Segmentation using OD and SAM APIs on VHR Images#

Read the annotations from one example, convert them from OBB to AABB, and use the detected bounding boxes as prompts for the SAM API to segment objects#

img_files
['patch_209.tif',
 'patch_465.tif',
 'patch_104.tif',
 'patch_143.tif',
 'patch_176.tif',
 'patch_598.tif',
 'patch_338.tif',
 'patch_487.tif',
 'patch_33.tif']
example_filename = img_files[4]
example_tif = os.path.join("data/object_detection/tiles",example_filename)
example_obb = os.path.join("data/object_detection/vhr_bboxes",example_filename.split(".")[0]+".txt")
example_tif_s3 = os.path.join("s3://MoBucket/obj-det",example_filename)

annotations = read_annotation_file(example_obb)
img_tif = tiff.imread(example_tif)

Convert OBB to AABB and visualize for verification#

import math
fig, ax = plt.subplots(1)

# Display the image
ax.imshow(img_tif.astype("uint8"))

# Add the bounding boxes
list_aabb_bbox = []
for box in annotations:
    obb = box["coordinates"]
    x_min,y_min,x_max,y_max = obb_to_aabb(obb)
    x_min,y_min,x_max,y_max = math.ceil(x_min),math.ceil(y_min),math.ceil(x_max),math.ceil(y_max)
    list_aabb_bbox.append([x_min,y_min,x_max,y_max])
    width = x_max - x_min
    height = y_max - y_min
    rect = patches.Rectangle((x_min, y_min), width, height, linewidth=2, edgecolor='r', facecolor='none')
    ax.add_patch(rect)

# Display the plot
plt.show()
_images/029cb23123b252d2d1d6e55e64f32878c9b35ce9a17fcd846d659621a51912fa.png
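
obb_to_aabb (from utils) only needs the extremes of the four corner points; a minimal sketch matching its use above:

def obb_to_aabb(obb):
    # obb: flat list [x1, y1, x2, y2, x3, y3, x4, y4] of corner points.
    xs, ys = obb[0::2], obb[1::2]
    return min(xs), min(ys), max(xs), max(ys)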

Send a request with the bbox prompts to the SAM API#

data = {
    "list_images": [
        {
            "image_name": os.path.basename(example_tif_s3),
            "image_uri": example_tif_s3,
            "bboxes": list_aabb_bbox,
            "points": None,
            "labels": None
        },
    ]
}
response = requests.post(os.path.join(sam_api,"api/v1/prompt"),headers=headers,json=data)
print (response.text)
task_id = json.loads(response.content.decode())
{"task_id":"a924317a-7668-4e95-bac0-53c76b77a70c"}

Monitor the task status#

params = {'task_id': task_id["task_id"]}
response = requests.get(os.path.join(sam_api,"api/v1/task/status"), headers=headers, params=params)
print(response.text)
{"task_id":"a924317a-7668-4e95-bac0-53c76b77a70c","state":"SUCCESS","result":"[{'image_uri': 's3://MoBucket/obj-det/patch_176.tif', 'processed': True, 'png_result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/a924317a-7668-4e95-bac0-53c76b77a70c/patch_176_bbox_mask.png?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=3Y%2FuexSNS17P0UFWrAj4WHo0y4A%3D&Expires=1719432557', 'png_result_uri': 's3://MoBucket/a924317a-7668-4e95-bac0-53c76b77a70c/patch_176_bbox_mask.png', 'tif_result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/a924317a-7668-4e95-bac0-53c76b77a70c/patch_176_bbox_mask.tif?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=%2BVYPqd7xSxUe%2BXeTXIn34PXB7JI%3D&Expires=1719432557', 'tif_result_uri': 's3://MoBucket/a924317a-7668-4e95-bac0-53c76b77a70c/patch_176_bbox_mask.tif'}]"}

Get results#

params = {'task_id': task_id["task_id"]}
response = requests.get(os.path.join(sam_api,"api/v1/prompt"), headers=headers, params=params)
results = response.json()
print (results)
{'results': [{'image_uri': 's3://MoBucket/obj-det/patch_176.tif', 'processed': True, 'png_result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/a924317a-7668-4e95-bac0-53c76b77a70c/patch_176_bbox_mask.png?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=3Y%2FuexSNS17P0UFWrAj4WHo0y4A%3D&Expires=1719432557', 'png_result_uri': 's3://MoBucket/a924317a-7668-4e95-bac0-53c76b77a70c/patch_176_bbox_mask.png', 'tif_result_url': 'https://object-store.os-api.cci1.ecmwf.int/MoBucket/a924317a-7668-4e95-bac0-53c76b77a70c/patch_176_bbox_mask.tif?AWSAccessKeyId=e850aff0dd5749a0a8df9f909014049c&Signature=%2BVYPqd7xSxUe%2BXeTXIn34PXB7JI%3D&Expires=1719432557', 'tif_result_uri': 's3://MoBucket/a924317a-7668-4e95-bac0-53c76b77a70c/patch_176_bbox_mask.tif'}]}

Save results to local file#

sam_output_tif_path = os.path.join(segmentation_output_path,f"prompt_mask_{example_filename.split('.')[0]}.tif")
save_image_from_url(results["results"][0]["tif_result_url"],sam_output_tif_path)
Image successfully saved to data/object_detection/sam_segmentation/prompt_mask_patch_176.tif

Visualize results#

msk_img = tiff.imread(sam_output_tif_path)
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.imshow(img_tif.astype("uint8"))
plt.axis('off')
plt.title('Original Image')
msk_img = msk_img.astype("float32")  # cast to float so background pixels can be set to NaN
msk_img[msk_img==0] = np.nan
plt.subplot(1, 2, 2)
plt.imshow(img_tif.astype("uint8"))
plt.imshow(msk_img, cmap='jet', alpha=0.5)
plt.axis('off')
plt.title('Mask Overlay')
plt.tight_layout()
plt.show()
_images/688b2b3cf358464c3cd4aff6183dc7fd616f8d4a15435c447dfa3632ad162571.png