discussion / Remote Sensing & GIS  / 3 March 2023

Google Earth Engine vs Microsoft's Planetary Computer: Which do I use? 

It was an utter delight to have Dr. Gilberto Câmara along to our Variety Hour show last week. He was an incredibly engaging and knowledgeable presenter, and he ended up staying back to continue the conversation for half an hour beyond the end of the call.

I invited Gilberto along as our main presenter after I stumbled upon this thread he posted on Twitter: 'Google Earth Engine (GEE) or Microsoft Planetary Computer (MPC): important cloud services for Earth observation (EO) data analysis. What are the strong points of each? What are their weaknesses? Follow the thread....'

It was the first time I'd seen someone explain the difference between the two platforms, and I thought it might be a useful topic for other people in our community wondering what the Planetary Computer actually is and how it compares to existing platforms. The Twitter thread is a great start, and Gilberto expands on it in his Variety Hour talk:

 

If you have any questions about the talk, please drop them below and we'll get Gilberto to pop in and answer you!

Steph 




Thanks a lot for sharing this recording! And a big thank you to Dr. Gilberto Câmara for explaining the differences between the two platforms. I also very much applaud his words on language, on how different landscapes can look the same or different depending on the image or sensor, and on the importance of taking the time dimension into account.

One question: local knowledge is essential, but Dr. Câmara talks mostly about the big picture, implicitly saying that it is essential as well. How does he see the relation(s) between the two?

@StephODonnell, yes, can we please invite Dr. Câmara for the two-hour session? The example of the savanna and the pasture made my heart skip a beat. So I am very interested in the analytical and conservation policy-making pitfalls of using satellite data (i.e. big data), and he seems to know exactly where they are.

 

Dear @Frank_van_der_Most and @StephODonnell, thanks for the comments. Regarding the importance of local knowledge in EO data classification, some thoughts follow:

1. Consider two AI applications: large language models (LLMs) and object recognition in images. LLMs such as ChatGPT use words to predict the next word. Since language is its own meta-language, LLMs rely on the fact that our understanding of written text is direct. There are no intermediaries between humans and the printed page.

2. Object recognition in images (e.g., face recognition) is another kind of AI application where there is an implicit assumption: there are objects (faces, cars, etc) in the image and the role of the algorithm is to distinguish them from the background (considered as unwanted noise).

3. Classification and interpretation of Earth observation data, by contrast, use a different paradigm. In principle, all of the data is informative. Unlike face recognition, there is no background. Every pixel counts. Pixel values are not words, but measures of reflections, emissions or echoes of the Earth's surface.

4. We use words to describe the reality external to us. The variety of nature is such that we have to use simplifications and taxonomies to describe our landscapes. Take the word "forest". As Chazdon et al. question in their 2016 paper, "when is a forest a forest?" The answer is: it depends on who is asking the question. 

5. There have been many attempts to link pixel values directly to landscape descriptions, e.g., "pixels with NDVI > 0.75 are forests". Are they? What about dry forests that only have high NDVI values in the wet season? So far, all attempts to use direct links between pixel values and landscapes have failed the test of rigour.

6. Another example is the algorithm used by Global Forest Watch to measure tree cover gain and loss. As explained in the link below, "Not all tree cover is a forest". As GFW acknowledges, their algorithm has problems distinguishing forest from oil-palm plantations and identifying trees in dry forests (see more at https://research.wri.org/gfr/data-methods?utm_campaign=treecoverloss2021&Limitations#limitations).

7. Some of you may know the attempt made by FAO to standardize land use and land cover classification using the LCCS ontology. LCCS describes land properties based only on land cover types, disregarding land use. For example, LCCS does not distinguish ‘pasture’ from ‘natural grasslands’; it labels both as herbaceous land cover types. Classification in LCCS has no temporal reference. For a more detailed criticism, see Camara (2020). 

8.  There is no shortage of global land cover and land use maps. While these maps provide a general sense of the global picture, very few (if any) have local significance. As those in the WILDLABS community know, local context matters. My favourite example is the Brazilian Cerrado, an endangered biodiversity hotspot. In the last decades, many areas of natural vegetation in the Cerrado have been converted to pasture for cattle raising. However, global maps inevitably label both pastures and natural Cerrado vegetation as "grasslands". Clearly, such data is hardly usable for supporting studies and public policies in the Cerrado.

9. What is the alternative for mapping areas such as the Cerrado? The only way I see is to gather experts who understand the uniqueness of each ecosystem and to relate each landscape to the signals measured by EO satellites. This is hard and painstaking work, with many iterations.

10. The recent availability of open big EO data is a blessing and a curse. Using time series, experts can use the temporal evolution of the pixel values to improve the discriminatory power of EO data. Take the distinction between herbaceous pasture and natural Cerrado vegetation. All savannas of the planet (including the Cerrado) have evolved to be resilient to the dry season and to fire. Therefore, while in the wet season it is sometimes difficult to distinguish between herbaceous pasture and natural Cerrado, the two become easier to tell apart in the dry season. This is a case where time series and big data improve the classification results.

11. Big EO data is also a curse, since it requires experts to rethink how to use EO data for land classification. Selecting training samples by looking at a single image is too simplistic when we are classifying time series. Linking the values of a time series to the temporal evolution of the landscapes requires relearning what EO data is. 
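
As a rough illustration of points 5 and 10, here is a minimal Python sketch. Everything in it is an assumption made for the example: the monthly NDVI values are invented (not measurements), the class names and month groupings are placeholders, and the idea that pasture loses more greenness than natural Cerrado in the dry season is assumed only to mirror the argument above. It shows a fixed NDVI threshold labelling the same dry-forest pixel differently in January and August, and the pasture/Cerrado gap widening in the dry season.

```python
# Illustrative sketch only: all NDVI values below are invented, not measured.

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)

# Hypothetical month indices (0 = January) for a Southern Hemisphere site.
WET_MONTHS = [0, 1, 2, 10, 11]  # Jan, Feb, Mar, Nov, Dec
DRY_MONTHS = [5, 6, 7, 8]       # Jun, Jul, Aug, Sep

# Hypothetical monthly NDVI time series, January to December.
SERIES = {
    "dry_forest": [0.82, 0.84, 0.80, 0.70, 0.58, 0.45,
                   0.38, 0.36, 0.42, 0.60, 0.76, 0.81],
    "cerrado":    [0.72, 0.74, 0.71, 0.66, 0.60, 0.55,
                   0.50, 0.48, 0.52, 0.60, 0.68, 0.71],
    "pasture":    [0.73, 0.75, 0.70, 0.60, 0.48, 0.36,
                   0.28, 0.25, 0.30, 0.48, 0.66, 0.72],
}

def is_forest_by_threshold(ndvi_value, threshold=0.75):
    """The naive single-date rule quoted in point 5."""
    return ndvi_value > threshold

def mean(values):
    return sum(values) / len(values)

def seasonal_gap(series_a, series_b, months):
    """Absolute difference between the mean NDVI of two series over some months."""
    mean_a = mean([series_a[m] for m in months])
    mean_b = mean([series_b[m] for m in months])
    return abs(mean_a - mean_b)

# Point 5: the same dry-forest pixel flips label depending on the image date.
january_label = is_forest_by_threshold(SERIES["dry_forest"][0])
august_label = is_forest_by_threshold(SERIES["dry_forest"][7])

# Point 10: the two classes are close in the wet season, apart in the dry season.
wet_gap = seasonal_gap(SERIES["cerrado"], SERIES["pasture"], WET_MONTHS)
dry_gap = seasonal_gap(SERIES["cerrado"], SERIES["pasture"], DRY_MONTHS)

print("dry forest labelled forest in January:", january_label)
print("dry forest labelled forest in August: ", august_label)
print("Cerrado vs pasture gap, wet season:", round(wet_gap, 3))
print("Cerrado vs pasture gap, dry season:", round(dry_gap, 3))
```

With real data the curves would come from expert-selected samples and a proper time series classifier rather than a seasonal mean, but the sketch captures why a single image date is not enough.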

Long story short: using Earth observation for conservation studies and public policy making is hard. It requires the combination of big EO data, good algorithms, and lots of expertise to understand the information inherent in the data. A nice challenge to all!

References cited:

Chazdon et al., "When is a forest a forest? Forest concepts and definitions in the era of forest and landscape restoration". Ambio, 45, pp. 538–550 (2016).

Camara, "On the semantics of big Earth observation data for land classification". Journal of Spatial Information Science, 20 (2020).