IV.4.4 Classification

Summary

IV - FROM DATA TO INFORMATION

 


4- HOW CAN WE ANALYSE IMAGES AND PRODUCE MAPS?

4.4- How to classify images/data?

Objects of similar nature usually also have similar image characteristics. Consider, for example, vegetation that reflects strongly in the infrared spectrum, residential houses that usually have distinctly different shape characteristics than factory buildings, or agricultural plots that exhibit different texture characteristics than forests.

Image features have long been used for visual interpretation of aerial photographs and satellite images. The image analyst then groups objects in the image that exhibit similar characteristics (see How can an image be visually analysed?). Thus, the image is divided into several sub-areas, each of which is assigned a particular category. Each image pixel is "classified" as it were, i.e. assigned to a particular class.

The figure above shows part of a satellite image captured by the Thematic Mapper sensor of the Landsat 4 satellite in the spring. The area shown is the region around the Platte-Taille reservoir and L'Eau d'Heure lakes near Philippeville. The image is displayed as a false colour composite where the infrared, red and green spectral channels are assigned to the red, green and blue screen colours, respectively. Based on differences in colour, intensity, shapes and textures, we can visually distinguish a number of classes. For instance, the lakes of Eau d'Heure are very dark blue (almost black), the roads and inhabited areas are quite dark grey-blue, the deciduous forests, which are dominant in the region, are dark red and the few spots with coniferous trees appear darker brown.

Since digital images are nothing but matrices of numbers, we can harness the computational power of computers to analyse and interpret them.

The first computer algorithms developed to create image classifications relied solely on spectral characteristics of individual pixels. Indeed, the image sensor on board aircraft or satellites records the so-called spectral signature of objects on the Earth's surface. Objects of similar nature usually have similar spectral properties. This means that the electromagnetic radiation reflected from objects of the same type is broadly similar and therefore these objects will have close spectral signatures.

We can distinguish two main groups of pixel-based classification algorithms. A first kind, the so-called "supervised" classification, will compare the spectral values of a pixel with "reference signatures" of a number of predefined classes and assign the pixel to the class with which the spectral features best match. Thus, to apply such algorithm, we need to determine in advance which spectral classes we want to distinguish and know what their spectral characteristics are.

Another approach is the so-called "unsupervised" algorithm. This does not work with predetermined classes but performs an automatic grouping or "clustering" of the spectral values of pixels. Pixels with similar values are then placed in the same group.  It is then up to the user to interpret these groups after classification and relate them to the ground cover.

The figure above shows the classification of a high-resolution satellite image of a Part of Dublin (Phoenix Park). The algorithm used for classification classifies individual pixels into predefined classes based only on spectral features.

Although spectral signatures can already tell us a lot about the nature of objects, classification algorithms that only look at the spectral values of individual pixels have a lot of drawbacks. After all, satellites only perceive the upper part of the outer shell of objects from orbit. For example, the satellite "sees" only the roof of a house.

Moreover, the spectral properties of each pixel are considered individually without considering the shape of the objects or what is happening in the environment of the pixel. At best, such methods then allow us to classify buildings based solely on the spectral characteristics of the roofing materials while many users are more interested in classifying them based on the function of the building (residential, commercial, industrial, etc.). This limitation is especially detrimental for identifying artificial elements of the landscape. The analysis of spectral signatures of objects is much more relevant when it comes to analysing soils, crops, natural vegetation, etc.

It is important to take this basic principle into account when defining the categories of a classification based only on spectral features. Indeed, these categories should always be based on land cover types rather than land use patterns. A car park and the asphalt roof of a building constitute different land use patterns, but they have spectral signatures so close together that they are likely to be classified in the same category.

More recent classification algorithms make use of a much wider arsenal of image characteristics than just the spectral values of individual pixels. The rise of machine learning and artificial intelligence has made it possible to extract much more information from a digital image.

For instance, convolutional neural networks can identify certain objects in a landscape such as buildings or archaeological remains thanks to shape features they learn themselves from a large number of examples, even if these objects do not have unambiguous spectral properties. Such algorithms are also capable of segmenting or dividing an image into areas of pixels belonging to the same category (e.g. streets, houses, trees, etc.).

Advanced computer algorithms allow us to detect the ancient stone mounds scattered in the vast Altai region (Russia) from space on high-resolution satellite images (here Gaofen-2). These algorithms use artificial intelligence (so-called "convolutional neural networks") to recognise the typical circular shapes of these archaeological remains in the image. Source: Fen Chen et al, Automatic detection of burial mounds (kurgans) in the Altai Mountains. ISPRS Journal of Photogrammetry and Remote Sensing, Volume 177,2021, p. 217-237, ISSN 0924-2716.