Unsupervised spatial data mining for the development of future scenarios: a Covid-19 application

  • Yuri Calleo
  • Simone Di Zio

In Futures Studies, scenario development makes it possible to formulate assumptions about what the futures may be, in order to support better decisions today. In the initial stages of scenario building (the Framing and Scanning phases), the process requires considerable time and effort to scan data and information (reading documents, reviewing the literature and consulting experts) in order to understand the object of the foresight study. The daily use of social networks causes an exponential increase in data; for this reason, we address the problem of speeding up and optimizing the Scanning phase by applying a new combined method that analyses tweets using unsupervised classification models, text mining and spatial data mining techniques. To obtain a qualitative overview, we applied the bag-of-words model and a Sentiment Analysis with the AFINN and VADER algorithms. Then, to extract the influence factors and the relevant key factors (Kayser and Blind, 2017; Kayser and Shala, 2020), we used Latent Dirichlet Allocation (LDA) (Tong and Zhang, 2016). Furthermore, to capture spatial information, we used spatial data mining techniques to extract georeferenced data, which made a geographic analysis possible. To showcase the method, we provide an example using Covid-19 tweets (Uhl et al., 2017), from which 5 topics and 6 key factors were extracted. Finally, for each influence factor, a cartogram was built from the relative frequencies to show the spatial distribution of the users discussing each topic. The results meet the research objectives, and the proposed approach could benefit the scenario development process.
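Three of the steps named in the abstract (bag-of-words counting, lexicon-based sentiment scoring, and per-region relative frequencies as cartogram input) can be illustrated with a minimal sketch. It is not the authors' implementation: the tweets, region names, lexicon values and function names below are all hypothetical, the lexicon only imitates the integer-valence idea behind AFINN, and a simple keyword match stands in for the LDA topic assignment.

```python
import re
from collections import Counter

# Hypothetical toy data: (region, tweet text). Not taken from the study.
TWEETS = [
    ("Lazio", "Vaccine rollout is great news"),
    ("Lazio", "Lockdown again, terrible news"),
    ("Lombardia", "Great vaccine progress today"),
    ("Lombardia", "Vaccine queue was long"),
]

# Made-up valence lexicon in the spirit of AFINN (word -> integer score).
LEXICON = {"great": 3, "terrible": -3, "long": -1}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(texts):
    """Count token occurrences over all texts, ignoring word order."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def lexicon_score(text):
    """Sum the valence of every token found in the lexicon."""
    return sum(LEXICON.get(token, 0) for token in tokenize(text))


def relative_frequency_by_region(tweets, keyword):
    """Share of tweets per region that mention the keyword.

    The keyword match is a stand-in for a topic assignment; the
    resulting shares are the kind of values a cartogram could scale by.
    """
    totals, hits = Counter(), Counter()
    for region, text in tweets:
        totals[region] += 1
        if keyword in tokenize(text):
            hits[region] += 1
    return {region: hits[region] / totals[region] for region in totals}


if __name__ == "__main__":
    print(bag_of_words(text for _, text in TWEETS).most_common(3))
    print([lexicon_score(text) for _, text in TWEETS])
    print(relative_frequency_by_region(TWEETS, "vaccine"))
```

Normalising by the number of tweets per region, rather than using raw counts, keeps regions with many active users from dominating the map.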

  • Keywords:
  • text-mining
  • spatial analysis
  • scenario development
  • georeferenced textual data
  • Covid-19

Yuri Calleo

University of Chieti-Pescara G. D'Annunzio, Italy - ORCID: 0000-0002-0190-6061

Simone Di Zio

University of Chieti-Pescara G. D'Annunzio, Italy - ORCID: 0000-0002-9139-1451

  1. Atenstaedt, R. (2012). Word cloud analysis of the BJGP. British Journal of General Practice, 62(596), pp. 148-148.
  2. Bishop, P., Hines, A., & Collins, T. (2007). The current state of scenario development: An overview of techniques. Foresight, 9(1), pp. 5-25.
  3. Goldberg, Y. (2017). Neural network methods for natural language processing. Synthesis lectures on human language technologies, 10(1), pp. 1-309.
  4. Haining, R.P. (2010). The nature of georeferenced data. Handbook of applied spatial analysis. Springer, Berlin, Heidelberg, pp. 197-217.
  5. Hines, A., & Bishop, P. (2015). Thinking about the Future: Guidelines for Strategic Foresight, 2nd Edition, Hinesight Edition, Houston (TX).
  6. Huang, F., Zhang, X., Zhao, Z., Xu, J., & Li, Z. (2019). Image–text sentiment analysis via deep multimodal attentive fusion. Knowledge-Based Systems, 167, pp. 26-37.
  7. Hutto, C., & Gilbert, E. (2014). Vader: A parsimonious rule-based model for sentiment analysis of social media text. Proceedings of the International AAAI Conference on Web and Social Media, 8(1).
  8. Kayser, V., & Blind, K. (2017). Extending the knowledge base of foresight: The contribution of text mining. Technological Forecasting and Social Change, 116, pp. 208-215.
  9. Kayser, V., & Shala, E. (2020). Scenario development using web mining for outlining technology futures. Technological Forecasting and Social Change, 156, 120086.
  10. Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis lectures on human language technologies, 5(1), pp. 1-167.
  11. Mayor, E., & Bietti, L. M. (2021). Twitter, time and emotions. Royal Society open science, 8(5), 201900.
  12. Minka, T. (2000). Estimating a Dirichlet Distribution. MIT Technical Report, Cambridge (US).
  13. Narasamma, V. L., Sreedevi, M., & Kumar, G. V. (2021). Tweet Data Analysis on COVID-19 Outbreak. Smart Technologies in Data Science and Communication, Springer, pp. 183-193.
  14. Pang, B., & Lee, L. (2008). Using very simple statistics for review search: An exploration. In Coling 2008. Companion volume: Posters, pp. 75-78.
  15. Poria, S., Cambria, E., Winterstein, G., & Huang, G. B. (2014). Sentic patterns: Dependency-based rules for concept-level sentiment analysis. Knowledge-Based Systems, 69, pp. 45-63.
  16. Taboada, M., Brooke, J., Tofiloski, M., Voll, K., & Stede, M. (2011). Lexicon-based methods for sentiment analysis. Computational linguistics, 37(2), pp. 267-307.
  17. Tan, M. J., & Guan, C. (2021). Are people happier in locations of high property value? Spatial temporal analytics of activity frequency, public sentiment and housing price using twitter data. Applied Geography, 132, 102474.
  18. Tong, Z., & Zhang, H. (2016). A text mining research based on LDA topic modelling. In International Conference on Computer Science, Engineering and Information Technology, pp. 201-210.
  19. Uhl, A., Kolleck, N., & Schiebel, E. (2017). Twitter data analysis as contribution to strategic foresight-The case of the EU Research Project “Foresight and Modelling for European Health Policy and Regulations” (FRESHER). European Journal of Futures Research, 5(1), pp. 1-16.
  20. Wang, X., & Grimson, E. (2007). Spatial Latent Dirichlet Allocation. NIPS, 20, pp. 1577-1584.
PDF
  • Publication year: 2021
  • Pages: 173-178

XML
  • Publication year: 2021

Chapter information

Chapter title

Unsupervised spatial data mining for the development of future scenarios: a Covid-19 application

Authors

Yuri Calleo, Simone Di Zio

Language

English

DOI

10.36253/978-88-5518-461-8.33

Peer-reviewed work

Publication year

2021

Copyright

© 2021 Author(s)

License

CC BY 4.0

Metadata license

CC0 1.0

Bibliographic information

Book title

ASA 2021 Statistics and Information Systems for Policy Evaluation

Book subtitle

BOOK OF SHORT PAPERS of the on-site conference

Editors

Bruno Bertaccini, Luigi Fabbris, Alessandra Petrucci

Peer-reviewed work

Publication year

2021

Copyright

© 2021 Author(s)

License

CC BY 4.0

Metadata license

CC0 1.0

Publisher

Firenze University Press

DOI

10.36253/978-88-5518-461-8

eISBN (pdf)

978-88-5518-461-8

eISBN (xml)

978-88-5518-462-5

Series

Proceedings e report

Series ISSN

2704-601X

Series e-ISSN

2704-5846

