Short Bio

I am an Associate Professor at the University of Modena and Reggio Emilia. Before that, I was a postdoctoral associate in the Data Systems Group at MIT. I received my PhD in Computer Science from the University of Modena in 2016. My doctoral dissertation won the PhD Thesis Award from the IEEE Computer Society Italy Section. I held visiting positions at the DB group of the University of Michigan and the Data Analytics group of the Qatar Computing Research Institute. My research interests include data integration and big data management for AI.

News


    [Feb '24] "Determining the Largest Overlap between Tables" will appear at this year's SIGMOD.
    [Jun '23] BrewER will be a demo at this year's VLDB (video, code).
    [May '23] Glad to be among the Research Track Best Reviewers at ICDE 2023.
    [May '22] Panelist at ICDE for a round table on Data Extraction, Integration and Cleaning.
    [Apr '22] Second paper accepted at this year's VLDB: "Generalized Supervised Meta-blocking".
    [Feb '22] Paper "Entity Resolution On-demand" accepted at VLDB 2022, see you in Sydney.
    [Feb '22] I'm co-chair of the SIGMOD '22 Programming Contest: contest page.

    [Feb '21] SIGMOD '21 Programming Contest is online: contest page.
    [Sep '20] Serving as Programming Competition Co-chair at SIGMOD '21.
    [Jun '20] My master's student Luca Zecchini is runner-up (2nd prize) in the SIGMOD '20 Programming Contest.
    [Jun '20] "Three-dimensional Entity Resolution with JedAI" accepted at Information Systems.
    [May '20] "BLAST2: an Efficient Technique for Loose Schema Information Extraction from heterogeneous Big Data sources" accepted at ACM JDIQ.
    [Jan '20] "RulER: Scaling Up Record-level Matching Rules" accepted as a demo at EDBT 2020.
    [Jan '20] "JedAI^3: beyond batch, blocking-based Entity Resolution" accepted as a demo at EDBT 2020.
    [Jan '20] "Dagger: A Data (not code) Debugger" presented at CIDR 2020.
    [Sep '19] Italian National Scientific Habilitation ("ASN") as Associate Professor -- Sector 09/H1, valid until 09/09/2025 -- [certificate].
    [May '19] "Data Civilizer 2.0: A Holistic Framework for Data Preparation and Analytics" accepted as a demo at VLDB 2019.
    [Mar '19] Paper accepted at Inf. Syst.: "Scaling entity resolution: A loosely schema-aware approach" -- G. Simonini, L. Gagliardelli, S. Bergamaschi, H. V. Jagadish. [DOI]
    [Dec '18] "SparkER: Scaling Entity Resolution in Spark" accepted as a demo at EDBT 2019. Code for scaling Entity Resolution on Apache Spark is available on [github].
    [Sep '18] Starting a postdoc at MIT CSAIL, joining the database group.
    [Sep '18] My research project about application-driven data cleaning has been funded by UniMoRe FAR (35,000 €).
    [Jun '18] Paper accepted at IEEE TKDE: "Schema-agnostic Progressive Entity Resolution" (extended version) -- G. Simonini, G. Papadakis, T. Palpanas, S. Bergamaschi
    [Feb '18] Paper accepted at Inf. Syst.: "Computing inter-document similarity with Context Semantic Analysis" -- F. Benedetti, D. Beneventano, S. Bergamaschi, G. Simonini
    [Dec '17] Paper accepted at ICDE: "Schema-agnostic Progressive Entity Resolution" -- G. Simonini, G. Papadakis, T. Palpanas, S. Bergamaschi

Service

Organization
  • VLDB 2025 Proceedings Co-chair
  • Editorial Review Board of the ACM Journal of Data and Information Quality (2024)
  • SIGMOD 2022 Programming Contest Co-chair
  • SIGMOD 2021 Programming Contest Co-chair
  • Program Chair of BDAA 2018 (as part of IEEE HPCS 2018)
  • Technical Session Chair “Big Data Integration and IoT for Smart Health Care”, IEEE RTSI 2017

Program Committee Member
  • VLDB 2023, 2025
  • SIGMOD 2023, 2025
  • ICDE 2022, 2023 (Best Reviewers Award)
  • EDBT 2021, 2022 (demo), 2024
  • TheWebConf 2022, 2024 (Industry)
  • SEBD 2021-22, 2024
  • SEAdata@VLDB 2020-21
  • DASFAA 2019 (demo)
  • MEDI 2018
  • BDAA HPCS 2014-18
  • VLDB 2014 (external reviewer)

Reviewer (journals)
  • IEEE TKDE
  • Neurocomputing
  • DKE
  • ACM JDIQ

Publications

      2022
    • Entity Resolution On-demand (new)
      PVLDB
      G. Simonini, L. Zecchini, S. Bergamaschi, F. Naumann
    • Generalized Supervised Meta-blocking (new)
      PVLDB
      L. Gagliardelli, G. Papadakis, G. Simonini, S. Bergamaschi, T. Palpanas


      2021
    • Reproducible Experiments on Three-Dimensional Entity Resolution with JedAI
      Information Systems
      G. Mandilaras, G. Papadakis, L. Gagliardelli, G. Simonini, E. Thanos, G. Giannakopoulos, S. Bergamaschi, T. Palpanas, M. Koubarakis, A. Lara-Clares, A. Farina.
    • The Case for Multi-task Active Learning Entity Resolution
      SEBD
      G. Simonini, S. Henrique, L. Gagliardelli, L. Zecchini, D. Beneventano, S. Bergamaschi


      2020
    • Dagger: A Data (not code) Debugger
      CIDR
      E. Rezig, L. Cao, G. Simonini, M. Schoemans, S. Madden, M. Ouzzani, N. Tang, M. Stonebraker.
    • Three-dimensional Entity Resolution with JedAI
      Information Systems
      G. Papadakis, G. Mandilaras, L. Gagliardelli, G. Simonini, E. Thanos, G. Giannakopoulos, S. Bergamaschi, T. Palpanas, M. Koubarakis.
    • BLAST2: an Efficient Technique for Loose Schema Information Extraction from heterogeneous Big Data sources
      ACM Journal of Data and Information Quality
      D. Beneventano, S. Bergamaschi, L. Gagliardelli, G. Simonini.
    • RulER: Scaling Up Record-level Matching Rules
      EDBT
      L. Gagliardelli, G. Simonini and S. Bergamaschi.
    • JedAI^3: beyond batch, blocking-based Entity Resolution
      EDBT
      G. Papadakis, L. Tsekouras, E. Thanos, N. Pittaras, G. Simonini, D. Skoutas, P. Isaris, G. Giannakopoulos, T. Palpanas and M. Koubarakis.
    • Scaling Up Record-level Matching Rules
      SEBD
      L. Gagliardelli, G. Simonini and S. Bergamaschi.
    • Entity Resolution on Camera Records Without Machine Learning
      DI2KG@VLDB
      L. Zecchini, G. Simonini, S. Bergamaschi


      2019
    • Data Civilizer 2.0: A Holistic Framework for Data Preparation and Analytics
      PVLDB
      E. Rezig, L. Cao, M. Stonebraker, G. Simonini, W. Tao, S. Madden, M. Ouzzani, N. Tang, A. K. Elmagarmid.
    • Scaling entity resolution: A loosely schema-aware approach
      Information Systems
      G. Simonini, L. Gagliardelli, S. Bergamaschi, H. V. Jagadish.
    • SparkER: Scaling Entity Resolution in Spark
      EDBT
      L. Gagliardelli, G. Simonini, D. Beneventano, S. Bergamaschi.
    • Schema-agnostic Progressive Entity Resolution
      IEEE TKDE
      G. Simonini, G. Papadakis, T. Palpanas, S. Bergamaschi
    • Computing inter-document similarity with Context Semantic Analysis
      Information Systems
      F. Benedetti, D. Beneventano, S. Bergamaschi, G. Simonini


      2018
    • Schema-agnostic Progressive Entity Resolution
      ICDE
      G. Simonini, G. Papadakis, T. Palpanas, S. Bergamaschi
    • Enhancing Loosely Schema-aware Entity Resolution with User Interaction
      HPCS
      G. Simonini, L. Gagliardelli, S. Zhu, S. Bergamaschi
    • How improve Set Similarity Join based on prefix approach in distributed environment
      HPCS
      S. Zhu, L. Gagliardelli, D. Beneventano, G. Simonini
    • Enhancing Big Data Exploration with Faceted Browsing
      Classification, (Big) Data Analysis and Statistical Learning (Springer)
      G. Simonini, S. Zhu, S. Bergamaschi.
    • From Data Integration to Big Data Integration
      A Comprehensive Guide Through the Italian Database Research 2018, Springer
      S. Bergamaschi, D. Beneventano, F. Mandreoli, R. Martoglia, F. Guerra, M. Orsini, L. Po, M. Vincini, G. Simonini, S. Zhu, L. Gagliardelli, L. Magnotta. [PDF]


      2017
    • SOPJ: A Scalable Online Provenance Join for Data Integration
      HPCS
      S. Zhu, G. Fiameni, G. Simonini, S. Bergamaschi
    • BigBench Workload Executed by Using Apache Flink
      Procedia Manufacturing
      S. Bergamaschi, L. Gagliardelli, G. Simonini, S. Zhu


      2016
    • Loosely Schema-aware Techniques for Big Data Integration
      PhD Thesis (Best CS Thesis in Italy from IEEE)
      G. Simonini
    • BLAST: a loosely schema-aware meta-blocking approach for entity resolution
      PVLDB
      G. Simonini, S. Bergamaschi, H. V. Jagadish
    • Providing Insight into Data Source Topics
      Journal on Data Semantics
      S. Bergamaschi, F. Guerra, D. Ferrari, G. Simonini, Y. Velegrakis
    • Big data exploration with faceted browsing
      HPCS
      G. Simonini, Z. Song
    • Discovering the topics of a data source: a statistical approach
      SWSD Workshop @ISWC 2014
      G. Simonini, F. Guerra, S. Bergamaschi [PDF]
    • Towards Declarative Imperative Data-parallel Systems
      Italian Symposium on Advanced Database Systems, SEBD 2014
      M. Interlandi, G. Simonini, S. Bergamaschi [PDF]
    • Using big data to support automatic Word Sense Disambiguation
      IEEE International Conf. on High Performance Computing & Simulation, HPCS 2014
      G. Simonini, F. Guerra [PDF]
    • Supporting Image Search with Tag Clouds: a Preliminary Approach
      Advances in Multimedia, 2015
      G. Simonini, F. Guerra, M. Vincini [PDF]
    • Keyword Search over Relational Databases: Issues, Approaches and Open Challenges
      Bridging Between Information Retrieval and Databases. Springer, Berlin, Heidelberg, 2014. 54-73.
      S. Bergamaschi, F. Guerra, G. Simonini [PDF]

    Awards & Grants

    • Honorable mention, "Premio Giovani Ricercatori" 2020, Gruppo 2003 (2020)
    • 35,000€ research grant from the University of Modena and Reggio Emilia (FAR Junior call) for my research project about application-driven data cleaning (2018)
    • IEEE Computer Society Italy Section Chapter PhD Thesis Award (2017)
    • Certificate of merit for national and international research from University of Modena and Reggio Emilia (2017)
    • VLDB 2016 Travel Fellowship (2016)
    • Spinner 2013 Scholarship -- one year grant from "Regione Emilia Romagna" (2013)

    Teaching

    For Students

    Office Loc.: Dept. Eng. "Enzo Ferrari" and Dept. Econ. "Marco Biagi" (please email in advance to schedule).
    Office Hours: Thu 4pm-6pm (please email in advance to schedule)

    Courses

  • Human Resource Information Systems e Data Science
    (ITA) Dept. Econ. "Marco Biagi"

  • Metodi quantitativi e Computer Science
    (ITA) Dept. Econ. "Marco Biagi"

  • Basi di Dati
    (ITA) Dept. Eng. "Enzo Ferrari", Mantova campus

  • Big Data Management and Governance
    (ENG) Dept. Eng. "Enzo Ferrari"