At the birth of the AI for Greener Cities Challenge, there was a fleet of delivery vehicles driving around European cities. DPD, its owner, intended to gather sensor data to plan better delivery routes that avoid possible traffic congestion. The initial goal was to save time and reduce carbon dioxide emissions.
But maybe there was more use for sensor-equipped cars. For example, why not create helpful open-source solutions for cities to become more sustainable?
DPD Netherlands teamed up with Jheronimus Academy of Data Science and FruitPunch AI to assemble a team of data scientists and AI enthusiasts to come up with potentially useful applications, and to have some fun with machine learning while doing it.
We, the AI for Greener Cities engineers, accepted the Challenge. Our team was split into two groups focusing on the most promising avenues: traffic detection and vegetation monitoring.
Team BusAround set out to detect the traffic density caused by buses and other heavy vehicles.
“With a background in business strategy I’ve learnt the importance of focus, focus, focus. During this 10-week data science project I discovered its equivalent: scoping, scoping, scoping. We had high ambitions, but realized early on that the given time and resources were limited. The discussion turned from ‘what is interesting to do?’ to ‘what can we do?’” *Antoine Miltenburg*, AI for Greener Cities Engineer
The aim was to define a model that indicates the ratio between buses and private vehicles, the ratio between heavy and light vehicles, and the traffic density in any given street. This model should work with video recordings from the dashcams or other cameras installed on the DPD delivery vans and on other vehicles with a similar sensor set-up (for example, a garbage collection truck). This is schematically shown here:
The intended result of our proof of concept was a data visualization pipeline that shows how the output of the deep learning model can be used to understand the data and to draw insights from it.
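To make the three target quantities concrete, here is a minimal sketch of how they could be derived from per-frame detection labels. It is not the team's pipeline: the function name `traffic_summary`, the class names, the heavy/light grouping, and the choice to treat cars as "private vehicles" are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical grouping of detector labels (an assumption, not from the project).
HEAVY = {"bus", "truck"}
LIGHT = {"car", "motorcycle", "bicycle"}


def traffic_summary(labels_per_frame):
    """Summarise vehicle detections for one street segment.

    labels_per_frame: list of lists, one list of class labels per video frame.
    Ratios are None when the denominator is zero.
    """
    counts = Counter(label for frame in labels_per_frame for label in frame)
    buses = counts["bus"]
    cars = counts["car"]  # cars stand in for "private vehicles" here
    heavy = sum(counts[c] for c in HEAVY)
    light = sum(counts[c] for c in LIGHT)
    n_frames = len(labels_per_frame)
    return {
        "bus_to_car_ratio": buses / cars if cars else None,
        "heavy_to_light_ratio": heavy / light if light else None,
        # Average number of detected vehicles per frame as a crude density proxy.
        "vehicles_per_frame": (heavy + light) / n_frames if n_frames else 0.0,
    }


# Three made-up frames; "person" is ignored because it is not a vehicle.
frames = [["car", "car", "bus"], ["car", "truck"], ["car", "bicycle", "person"]]
summary = traffic_summary(frames)
print(summary)
```

Counting the same vehicle in consecutive frames inflates the totals, so a real pipeline would probably need object tracking or frame subsampling before computing these ratios.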
The team used the Berkeley DeepDrive dataset, BDD100K, to develop a first model. Training with this dataset offers a good first approximation of real-life scenarios for the future model for several reasons:
For the scope of this Challenge, we selected the videos recorded in downtown New York.
The next step was model selection. The choice of a model, already pre-trained on the BDD100K dataset, was based on testing three alternatives:
We decided to go with YOLOv5 and to train it on more than BDD100K alone, so we added the COCO dataset. YOLOv5 trained on COCO gave good results in detecting buses. We then pitted YOLOv5 against YOLOv7, and version 5 performed better.
Due to time constraints, we made use of available models without fine-tuning. The outcome of this project is therefore not yet ready for real-world implementation. What it does is indicate where data science could help improve the efficiency of driving routes.
Our team believes that using observations from video recordings will contribute to better data about traffic congestion and about the availability of public transport (and shared mobility) in certain streets at certain times of the day.
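As an illustration of how the output of a COCO-trained detector could be reduced to the vehicle classes of interest, the sketch below filters detection rows by class index and confidence. The row layout `(x1, y1, x2, y2, confidence, class_id)`, the function name `vehicle_labels`, and the 0.4 threshold are assumptions for this example; the class indices follow the standard 80-class COCO label list.

```python
# Vehicle entries from the 80-class COCO label list (index -> name).
COCO_VEHICLES = {1: "bicycle", 2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}


def vehicle_labels(detections, min_conf=0.4):
    """Keep confident vehicle detections and return their class names.

    detections: iterable of rows (x1, y1, x2, y2, confidence, class_id).
    min_conf: illustrative confidence threshold, not a tuned value.
    """
    labels = []
    for *_box, conf, cls in detections:
        name = COCO_VEHICLES.get(int(cls))
        if name is not None and conf >= min_conf:
            labels.append(name)
    return labels


# Made-up detections for a single frame.
rows = [
    (10, 20, 200, 180, 0.91, 5),   # bus
    (220, 90, 300, 150, 0.35, 2),  # car, below the threshold
    (310, 95, 380, 150, 0.78, 2),  # car
    (50, 60, 70, 120, 0.88, 0),    # person, not a vehicle
]
print(vehicle_labels(rows))
```

Lowering `min_conf` keeps more detections at the cost of more false positives, which matters when models are used without fine-tuning.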
Team Green Vegetation went on to develop a computer-vision-based automated urban re-greening suggestion system. The system analyzes video footage from cameras mounted on trucks and gives insights into places that need more plants and trees, using three metrics to score the city on its greenness.
We looked at data and models that had previously been created to get a better understanding of what needed to be done. Our workflow was planned around achieving one task per week, with weekly meetings where everyone could catch up, troubleshoot any issues, and set up the tasks for the following week.
Our goal was to create and compare several models that can be used to estimate two things:
Since the driving data consisted mostly of images of urban areas, potential green areas for cityscapes were identified using three metrics:
NDVI (Normalized Difference Vegetation Index) is a metric typically used with satellite imagery to measure the amount of greenery in an image. It does so by comparing how strongly each pixel reflects near-infrared light with how strongly it reflects red light, since healthy vegetation reflects much of the former and absorbs much of the latter. NDVI is typically used in the ecological field to compute greenery over large spaces in real time. Although NDVI is widely used and useful, it is mostly applied to satellite images, so we needed another metric for street view images.
GVI (Green View Index) is used to compute the amount of greenery in each image, and it can be applied to images from driving datasets. Although less well known than NDVI, it was better suited to our project. GVI computes the ratio of green to non-green pixels in an image. GVI was one of the main metrics used in the project, and NDVI was used to verify the GVI results.
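A minimal sketch of a GVI-style computation on a segmentation label map follows. It assumes the common formulation in which GVI is the share of vegetation pixels among all pixels in the image; the function name and the label ids in `VEGETATION_IDS` are hypothetical and depend on the segmentation model's label scheme.

```python
# Hypothetical label ids that count as greenery (depends on the label scheme).
VEGETATION_IDS = {8, 9}


def green_view_index(label_map, vegetation_ids=VEGETATION_IDS):
    """Return the fraction of pixels labelled as vegetation.

    label_map: 2D list of integer class ids, one per pixel.
    Returns 0.0 for an empty image.
    """
    total = 0
    green = 0
    for row in label_map:
        for label in row:
            total += 1
            if label in vegetation_ids:
                green += 1
    return green / total if total else 0.0


# Tiny made-up 3x4 label map: 0 = road, 8 and 9 = greenery.
label_map = [
    [0, 0, 8, 8],
    [0, 8, 8, 8],
    [0, 0, 0, 9],
]
print(green_view_index(label_map))
```

If the green to non-green ratio is wanted instead, it can be derived from the same counts as `green / (total - green)`.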
The last metric we used was VSD. This showed us the proportion of each vegetation type relative to other vegetation (grasses to shrubs, etc.). VSD is incredibly useful since some plants can absorb more CO2 than others, such as C3 trees compared to C4 shrubs and grasses. This means VSD can be used to maximize the amount of CO2 absorbed by green spaces by showing areas where more carbon-efficient vegetation can be planted.
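The idea of comparing vegetation types against each other can be sketched as a share-per-type calculation over labelled pixels. This is only an illustration of the proportion described above, not the team's VSD implementation; the function name `vegetation_shares` and the three type names are assumptions.

```python
from collections import Counter


def vegetation_shares(pixel_labels, vegetation_types=("tree", "shrub", "grass")):
    """Return each vegetation type's share of all vegetation pixels.

    pixel_labels: iterable of class names, one per pixel.
    Non-vegetation labels are ignored; all shares are 0.0 if no vegetation is found.
    """
    counts = Counter(label for label in pixel_labels if label in vegetation_types)
    total = sum(counts.values())
    if total == 0:
        return {t: 0.0 for t in vegetation_types}
    return {t: counts[t] / total for t in vegetation_types}


# Made-up pixel labels for one image: 10 vegetation pixels and 10 road pixels.
pixels = ["tree"] * 6 + ["grass"] * 3 + ["shrub"] * 1 + ["road"] * 10
shares = vegetation_shares(pixels)
print(shares)
```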
“One of our goals was to identify spots where greenery could be increased. We needed to understand which objects in an image are essential, which are already plants, and which can be removed to make green spaces.” Qiulin Li, AI for Greener Cities engineer
The first step was to use a segmentation model to identify objects such as roads and trees. We used three segmentation models to sort the objects in an image into categories. The models were trained on the BDD100K and Cityscapes datasets, both of which are open source.
We picked 3 segmentation models to train.
The most important suggestion we have for future users of these models is to measure urban greenery using all three metrics: GVI, VSD and NDVI. Combined in a calculation with different weights, the indicators can give a comprehensive greenery quality score. Municipalities need to know the actual balance of the city’s vegetation in order to take action.
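One simple way to combine the indicators is a weighted average, sketched below. The weights, the metric values, and the assumption that each metric has already been normalised to the range 0 to 1 are purely illustrative; choosing suitable weights is left to the user.

```python
def greenery_score(metrics, weights):
    """Weighted average of greenery metrics.

    metrics: dict of metric name -> value, assumed normalised to [0, 1].
    weights: dict of metric name -> non-negative weight (hypothetical values).
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(metrics[name] * weight for name, weight in weights.items())
    return weighted_sum / total_weight


# Made-up values for one street segment.
metrics = {"gvi": 0.30, "vsd": 0.50, "ndvi": 0.40}
weights = {"gvi": 0.5, "vsd": 0.3, "ndvi": 0.2}
score = greenery_score(metrics, weights)
print(round(score, 2))
```

Dividing by the total weight keeps the score on the same 0 to 1 scale even when the weights do not sum to one.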
The AI for Greener Cities Challenge has been an educational and impactful experience for all of us. With the literature and practical activities, we deepened our AI knowledge and improved our teamwork skills since we worked online in teams from four different continents.
Antoine Miltenburg, Jari Gabriëls, Qiulin Li
AI for Greener Cities Engineers
*Team Bus Around*: Sahil Chachra, Resham Sundar, Shubham Baid, Giuseppina Schiavone, Luca Simonetti, Antoine Miltenburg
*Team Green Vegetation*: Jari Gabriëls, Qiulin Li, Ha Trinh, Claudia Flores-Saviaga, Bruhanth Mallik, Animesh Maheshwari, Alexandre Capt