RPCA-based techniques for pattern extraction, hotspot identification and signal correction using data from a dense network of low-cost NO2 sensors in London

Publication: Contribution to journal, Journal article, Research, peer-reviewed

Documents

  • Fulltext

    Final published version, 10.9 MB, PDF document

High-density, low-cost air quality sensor networks are a promising technology for monitoring air quality at high temporal and spatial resolution. However, the collected data are high-dimensional, and it is not always clear how best to leverage this information, particularly given the lower quality of data from low-cost sensors. Here we report on the use of robust Principal Component Analysis (RPCA) on nitrogen dioxide data obtained from a recently deployed dense network of 225 air pollution monitoring nodes based on low-cost sensors in the Borough of Camden in London. RPCA addresses the brittleness of singular value decomposition towards outliers by decomposing the data into low-rank and sparse contributions, with the latter containing the outliers. The modal decomposition enabled by RPCA identifies major periodic patterns, including spatial and temporal bias, dominant spatial variance, and north-south bias. The five most descriptive components capture 98 % of the data's variance, achieving compression by a factor of 1500. We present a new technique that uses the sparse part of the data to identify hotspots. The data indicate that at the locations of the top 15 % most susceptible nodes in the network, the model identifies 23 % more hotspots than at all other locations combined. Moreover, the median hotspot event at these at-risk locations exceeds the mean NO2 concentration by 33 μg/m³. We show the potential of RPCA for signal correction; it corrects random errors, yielding a reference signal with R² > 0.8. Moreover, RPCA successfully reconstructs missing data from a sensor with R² = 0.72 using the rest of the sensor network, an improvement upon PCA of around 50 %, allowing air quality estimation even if a sensor is temporarily out of use.
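The low-rank plus sparse decomposition described in the abstract can be illustrated with a minimal sketch. It assumes the common principal component pursuit formulation of RPCA, solved with a fixed-penalty augmented Lagrangian iteration; the abstract does not state which solver the authors used. The function names (`rpca`, `hotspot_counts`), the synthetic data, and the hotspot threshold are hypothetical and are not taken from the paper.

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: shrink every entry of X towards zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: soft-threshold the singular values of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca(X, max_iter=1000, tol=1e-7):
    """Split X (time stamps x sensors) into low-rank L and sparse S, X ~ L + S.

    Assumes X is a non-zero float matrix. The penalty mu and weight lam
    follow common default choices; they are assumptions, not paper values.
    """
    n1, n2 = X.shape
    mu = n1 * n2 / (4.0 * np.sum(np.abs(X)))
    lam = 1.0 / np.sqrt(max(n1, n2))
    L = np.zeros_like(X)
    S = np.zeros_like(X)
    Y = np.zeros_like(X)  # Lagrange multiplier for the constraint X = L + S
    stop = tol * np.linalg.norm(X)
    for _ in range(max_iter):
        L = svt(X - S + Y / mu, 1.0 / mu)
        S = shrink(X - L + Y / mu, lam / mu)
        R = X - L - S
        Y = Y + mu * R
        if np.linalg.norm(R) < stop:
            break
    return L, S

def hotspot_counts(S, threshold):
    """Count, per sensor (column), the sparse entries above a threshold.

    A hypothetical criterion for illustration, not the authors' definition.
    """
    return (S > threshold).sum(axis=0)

# Synthetic stand-in for sensor data: per-sensor baseline plus a daily cycle
# (rank 2), with rare positive spikes playing the role of hotspot events.
rng = np.random.default_rng(0)
n_times, n_sensors = 200, 30
daily = np.sin(2 * np.pi * np.arange(n_times) / 24)
L_true = (np.outer(np.ones(n_times), rng.uniform(20, 40, n_sensors))
          + np.outer(daily, rng.uniform(5, 15, n_sensors)))
spikes = rng.random((n_times, n_sensors)) < 0.05
X = L_true + 40.0 * spikes

L, S = rpca(X)
counts = hotspot_counts(S, threshold=20.0)
```

In this sketch the rows of `X` are time stamps and the columns are sensors, so the columns of `S` give each sensor's outlier record. The singular value decomposition of `L` would then give the modal patterns, and a sensor's missing readings could be estimated from the corresponding column of `L`, in the spirit of the reconstruction described above.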

Original language: English
Article number: 171522
Journal: Science of the Total Environment
Volume: 925
Number of pages: 11
ISSN: 0048-9697
DOI
Status: Published - 2024

Bibliographical note

Publisher Copyright:
© 2024 The Authors

ID: 389079852