Giovanni Simonini

UNIMORE, Modena (Italy) simonini _at_ unimore.it

Short Bio

I am a Senior Researcher (RTD b) at the University of Modena and Reggio Emilia. Before that, I was a postdoctoral associate at MIT CSAIL, working with Prof. Michael Stonebraker. I received my PhD in Computer Science from the University of Modena in 2016. My doctoral dissertation won the PhD Thesis Award from the IEEE Computer Society Italy Section. My research interests include data integration and big data management.

News

  • [Feb '21] SIGMOD '21 Programming Contest is online: contest page.
  • [Sep '20] Serving as Programming Competition Co-chair at SIGMOD '21.
  • [Jun '20] My master's student Luca Zecchini was runner-up (2nd prize) in this year's Programming Competition at SIGMOD '20.
  • [Jun '20] "Three-dimensional Entity Resolution with JedAI" accepted at Information Systems.
  • [May '20] "BLAST2: an Efficient Technique for Loose Schema Information Extraction from heterogeneous Big Data sources" accepted at ACM JIDQ.
  • [Jan '20] "RulER: Scaling Up Record-level Matching Rules" accepted as a demo at EDBT 2020.
  • [Jan '20] "JedAI^3: beyond batch, blocking-based Entity Resolution" accepted as a demo at EDBT 2020.
  • [Oct '19] "Dagger: A Data (not code) Debugger" accepted at CIDR 2020.
  • [Sep '19] Italian National Scientific Habilitation ("ASN") as Associate Professor -- Sector 09/H1, valid until 09/09/2025 -- [certificate].
  • [May '19] "Data Civilizer 2.0: A Holistic Framework for Data Preparation and Analytics" accepted as a demo at VLDB 2019.
  • [Mar '19] Paper accepted at Inf. Syst.: "Scaling entity resolution: A loosely schema-aware approach" -- G. Simonini, L. Gagliardelli, S. Bergamaschi, H. V. Jagadish. [DOI]

Service

  • Programming Contest Co-chair at SIGMOD 2021
  • Program Chair of BDAA 2018 (as part of IEEE HPCS 2018)
  • Technical Session Chair “Big Data Integration and IoT for Smart Health Care”, IEEE RTSI 2017
  • Reviewer (journals): Transactions on Knowledge and Data Engineering, Data & Knowledge Engineering, Journal of Data and Information Quality
  • Reviewer (conferences): SIGMOD 2021, EDBT 2021, SEAdata@EDBT 2020, DASFAA 2019 (DEMO), MEDI 2018, BDAA/HPCS 2014/2015/2017/2018, ICDE 2018 (external), VLDB 2014 (external)

Publications

    [j] Journal -- [c] Conference -- [b] Book Chapter -- [t] Thesis
    • [j8] Three-dimensional Entity Resolution with JedAI. - Information Systems, Volume 93, November 2020, 101565 - G. Papadakis, G. Mandilaras, L. Gagliardelli, G. Simonini, E. Thanos, G. Giannakopoulos, S. Bergamaschi, T. Palpanas, M. Koubarakis. https://doi.org/10.1016/j.is.2020.101565
    • [j7] BLAST2: an Efficient Technique for Loose Schema Information Extraction from heterogeneous Big Data sources. - Journal of Data and Information Quality, 12, 4, Article 18 (November 2020) - D. Beneventano, S. Bergamaschi, L. Gagliardelli, G. Simonini. https://doi.org/10.1145/3394957
    • [c15] RulER: Scaling Up Record-level Matching Rules, EDBT 2020 (demo). - L. Gagliardelli, G. Simonini and S. Bergamaschi.
    • [c14] JedAI^3: beyond batch, blocking-based Entity Resolution, EDBT 2020 (demo). - G. Papadakis, L. Tsekouras, E. Thanos, N. Pittaras, G. Simonini, D. Skoutas, P. Isaris, G. Giannakopoulos, T. Palpanas and M. Koubarakis.
    • [c13] Dagger: A Data (not code) Debugger, CIDR 2020. - E. Rezig, L. Cao, G. Simonini, M. Schoemans, S. Madden, M. Ouzzani, N. Tang, M. Stonebraker.
    • [c12] Data Civilizer 2.0: A Holistic Framework for Data Preparation and Analytics, PVLDB 2019 (DEMO). - E. Rezig, L. Cao, M. Stonebraker, G. Simonini, W. Tao, S. Madden, M. Ouzzani, N. Tang, A. K. Elmagarmid.
    • [j6] Scaling entity resolution: A loosely schema-aware approach. - Information Systems, Volume 83, July 2019, Pages 145-165 - G. Simonini, L. Gagliardelli, S. Bergamaschi, H. V. Jagadish. https://doi.org/10.1016/j.is.2019.03.006
    • [c11] SparkER: Scaling Entity Resolution in Spark, EDBT 2019 (DEMO). - L. Gagliardelli, G. Simonini, D. Beneventano, S. Bergamaschi. [PDF]
    • [j5] Schema-agnostic Progressive Entity Resolution. - IEEE TKDE. - G. Simonini, G. Papadakis, T. Palpanas, S. Bergamaschi. https://ieeexplore.ieee.org/document/8403302/
    • [j4] Computing inter-document similarity with Context Semantic Analysis. - Information Systems, Volume 80, February 2019, Pages 136-147 - F. Benedetti, D. Beneventano, S. Bergamaschi, G. Simonini. https://doi.org/10.1016/j.is.2018.02.009 [PDF]
    • [c10] Schema-agnostic Progressive Entity Resolution. - IEEE International Conf. on Data Engineering, ICDE 2018. - G. Simonini, G. Papadakis, T. Palpanas, S. Bergamaschi. [PDF]
    • [c9] Enhancing Loosely Schema-aware Entity Resolution with User Interaction. - IEEE International Conf. on High Performance Computing & Simulation, HPCS 2018. - G. Simonini, L. Gagliardelli, S. Zhu, S. Bergamaschi. [PDF]
    • [c8] How to improve Set Similarity Join based on prefix approach in distributed environment. - IEEE International Conf. on High Performance Computing & Simulation, HPCS 2018. - S. Zhu, L. Gagliardelli, D. Beneventano, G. Simonini. [PDF]
    • [b2] Enhancing Big Data Exploration with Faceted Browsing. - “Classification, (Big) Data Analysis and Statistical Learning”, Springer. - G. Simonini, S. Zhu, S. Bergamaschi. [PDF]
    • [b1] From Data Integration to Big Data Integration. - “A Comprehensive Guide Through the Italian Database Research 2018”, Springer. - S. Bergamaschi, D. Beneventano, F. Mandreoli, R. Martoglia, F. Guerra, M. Orsini, L. Po, M. Vincini, G. Simonini, S. Zhu, L. Gagliardelli, L. Magnotta. [PDF]
    • [c7] SOPJ: A Scalable Online Provenance Join for Data Integration. - IEEE International Conf. on High Performance Computing & Simulation, HPCS 2017. - S. Zhu, G. Fiameni, G. Simonini, S. Bergamaschi. [PDF]
    • [c6] BigBench workload executed by using Apache Flink - Procedia Manufacturing 11 (2017): 695-702. - S. Bergamaschi, L. Gagliardelli, G. Simonini, S. Zhu [PDF]
    • [t] Loosely Schema-aware Techniques for Big Data Integration - PhD Thesis (2016) - G. Simonini [PDF]
    • [j3] BLAST: a loosely schema-aware meta-blocking approach for entity resolution - PVLDB 9.12 (2016): 1173-1184. - G. Simonini, S. Bergamaschi, H. V. Jagadish [PDF] [CODE]
    • [j2] Providing Insight into Data Source Topics - Journal on Data Semantics (2016): 1-18. - S. Bergamaschi, F. Guerra, D. Ferrari, G. Simonini, Y. Velegrakis [PDF]
    • [c5] Big data exploration with faceted browsing - IEEE International Conf. on High Performance Computing & Simulation, HPCS 2015 - G. Simonini, Z. Song [PDF]
    • [c4] Discovering the topics of a data source: a statistical approach - SWSD Workshop @ISWC 2014 - G. Simonini, F. Guerra, S. Bergamaschi [PDF]
    • [c3] Towards Declarative Imperative Data-parallel Systems - Italian Symposium on Advanced Database Systems, SEBD 2014 - M. Interlandi, G. Simonini, S. Bergamaschi [PDF]
    • [c2] Using big data to support automatic Word Sense Disambiguation - IEEE International Conf. on High Performance Computing & Simulation, HPCS 2014 - G. Simonini, F. Guerra [PDF]
    • [j1] Supporting Image Search with Tag Clouds: a Preliminary Approach - Advances in Multimedia, 2015 - G. Simonini, F. Guerra, M. Vincini [PDF]
    • [c1] Keyword Search over Relational Databases: Issues, Approaches and Open Challenges - Bridging Between Information Retrieval and Databases. Springer, Berlin, Heidelberg, 2014. 54-73. - S. Bergamaschi, F. Guerra, G. Simonini [PDF]

    Awards & Grants

    • 35,000€ research grant from University of Modena and Reggio Emilia (FAR Junior call), for my research project about application-driven data cleaning. (2018)
    • IEEE Computer Society Italy Section Chapter PhD Thesis Award (2017)
    • Certificate of merit for national and international research from University of Modena and Reggio Emilia (2017)
    • VLDB 2016 Travel Fellowship (2016)
    • Spinner 2013 Scholarship -- one year grant from "Regione Emilia Romagna" (2013)

    Teaching

    For Students

    Office Loc.: Building MO27, 1st floor - Via P. Vivarelli 10, Modena (Dipartimento di Ingegneria "Enzo Ferrari")
    Office Hours: Thu 4pm-6pm (please email in advance to schedule)