{"title":"Technical Perspective: Unicorn: A Unified Multi-Tasking Matching Model","authors":"A. Doan","doi":"10.1145/3665252.3665262","DOIUrl":"https://doi.org/10.1145/3665252.3665262","url":null,"abstract":"Data integration has been a long-standing challenge for data management. It has recently received significant attention due to at least three main reasons. First, many data science projects require integrating data from disparate sources before analysis can be carried out to extract insights. Second, many organizations want to build knowledge graphs, such as Customer 360s, Product 360s, and Supplier 360s, which capture all available information about the customers, products, and suppliers of an organization. Building such knowledge graphs often requires integrating data from multiple sources. Finally, there is also an increasing need to integrate a massive amount of data to create training data for AI models, such as large language models.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"25 20","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140980340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technical Perspective: Allocating Isolation Levels to Transactions in a Multiversion Setting","authors":"Alan D. Fekete","doi":"10.1145/3665252.3665256","DOIUrl":"https://doi.org/10.1145/3665252.3665256","url":null,"abstract":"Among the ways a database management system adds value is the transaction abstraction, where the application coder can group together multiple data accesses that collectively perform one meaningful real-world activity. The platform will provide the \"ACID\" properties (atomic, consistent, isolated and durable) so the whole transaction happens like a single event. The mechanisms that allow this appearance include crash recovery and rollback (usually based on log entries) and concurrency control (typically involving locks).","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"17 10","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140979971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technical Perspective: From Binary Join to Free Join","authors":"Thomas Neumann","doi":"10.1145/3665252.3665258","DOIUrl":"https://doi.org/10.1145/3665252.3665258","url":null,"abstract":"Most queries access data from more than one relation, which makes joins between relations an extremely common operation. In many cases the execution time of a query is dominated by the processing of the involved joins. This observation has led to a wide range of techniques to speed up join processing like, e.g. efficient hash joins, bitmap filters to eliminate non-joining tuples early on, blocked lookups to hide cache latencies, and many others.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"34 19","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140981123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technical Perspective: A Fresh Look at Stream Computation through DSP Glasses","authors":"Dan Olteanu","doi":"10.1145/3665252.3665270","DOIUrl":"https://doi.org/10.1145/3665252.3665270","url":null,"abstract":"DBSP (Data Base Stream Processing) is a simple yet expressive language for stream computation that draws inspiration from DSP (Digital Signal Processing). In DBSP, stream computation is expressed using circuits of stream operators whose input and output are (possibly infinite) sequences of database updates.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"28 37","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140980096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Report on the Workshop on Factorized Databases","authors":"Dan Olteanu","doi":"10.1145/3615952.3615967","DOIUrl":"https://doi.org/10.1145/3615952.3615967","url":null,"abstract":"The workshop took place in Zurich and online from August 2 to 4, 2022. It was attended by researchers from 17 academic institutions and industry labs, including Microsoft Gray Systems Lab, Omics Data Automation, Oracle Labs Zurich, RelationalAI, and TigerGraph. It featured 18 talks and plenty of opportunities for discussions. The vast majority of participants attended in person.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124672576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reminiscences on Influential Papers","authors":"Ashraf Aboulnaga","doi":"10.1145/3615952.3615958","DOIUrl":"https://doi.org/10.1145/3615952.3615958","url":null,"abstract":"This issue's contributors highlight their influences when it comes to their research agenda on parallel data processing and skyline queries, respectively. Enjoy reading! While I will keep inviting members of the data management community, and neighboring communities, to contribute to this column, I also welcome unsolicited contributions. Please contact me if you are interested.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125730403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experiences and Lessons Learned from the SIGMOD Entity Resolution Programming Contests","authors":"Andrea De Angelis, Maurizio Mazzei, Federico Piai, P. Merialdo, Giovanni Simonini, Luca Zecchini, S. Bergamaschi, D. Firmani, Xu Chu, Peng Li, Renzhi Wu","doi":"10.1145/3615952.3615965","DOIUrl":"https://doi.org/10.1145/3615952.3615965","url":null,"abstract":"We report our experience in running three editions (2020, 2021, 2022) of the SIGMOD programming contest, a well-known event for students to engage in solving exciting data management problems. During this period we had the opportunity of introducing participants to the entity resolution task, which is of paramount importance in the data integration community. We aim at sharing the executive decisions, made by the people coauthoring this report, and the lessons learned.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"736 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133970022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mid-Career Academics: What I Have Learned or Wish I Had Known","authors":"Qiong Luo","doi":"10.1145/3615952.3615960","DOIUrl":"https://doi.org/10.1145/3615952.3615960","url":null,"abstract":"As I agreed on contributing a piece to this series of which Tamer is in charge, I read all the preceding articles to get inspiration. The impression I got was, \"gosh, I wish I had known all this back in my mid-career days and even these days!\" For example, I did place students on my collaborative projects, but would have made faster progress in some projects if I had insisted on having weekly meetings. Or I should have considered a more detailed list of factors before committing to a task of considerable amount of work.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129697077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diversity, Equity and Inclusion Activities in Database Conferences: A 2022 Report","authors":"S. Amer-Yahia, D. Agrawal, Yael Amsterdamer, S. Bhowmick, Jesús Camacho-Rodríguez, B. Catania, Panos K. Chrysanthis, C. Curino, J. Darmont, G. Dobbie, A. E. Abbadi, Avrilia Floratou, Juliana Freire, Alekh Jindal, V. Kalogeraki, Sujaya Maiyya, Alexandra Meliou, Madhulika Mohanty, Behrooz Omidvar-Tehrani, Fatma Özcan, L. Peterfreund, Wenny Rahayu, S. Sadiq, Sana Sellami, Utku Sirin, Wang-Chiew Tan, Bhavani Thuraisingham, Neeraja Yadwadkar, Victor Zakhary, Meihui Zhang","doi":"10.1145/3615952.3615964","DOIUrl":"https://doi.org/10.1145/3615952.3615964","url":null,"abstract":"The Diversity, Equity and Inclusion (DEI) initiative started as the Diversity/Inclusion initiative in 2020 [4]. The current report summarizes our activities in 2022. Our responsibility as a community is to ensure that attendees of DB conferences feel included, irrespective of their scientific perspective and personal background. One of the first steps was to establish the role of the DEI chairs at DB Conferences, with the DEI team dedicated to providing leadership to help our community achieve this goal. In this leadership role, the DEI team is advising DEI chairs at DB conferences, serving as a memory of DEI events at conferences, building an agreed-upon vision, and committing to working together to devise a set of measures for achieving DEI. That is pursued via actions led by our core members (Figure 1) and liaisons of individual executive bodies (Figure 2): REACH OUT collects data and experiences from our community. INCLUDE monitors and recommends inclusion efforts. ORGANIZE focuses on in-conference organization efforts, such as adopting a code of conduct. INFORM communicates through various channels. SUPPORT coordinates DEI support from executive bodies and sponsors. SCOUT collates DEI efforts from other communities. COORDINATE manages all actions. Two new actions: MEDIA preserves and disseminates the digital media produced by DEI@DB events. ETHICS establishes and promotes ethics guidelines for publications in our community.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128374334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Role of Data Scientists in Modern Enterprises - Experience from Data Science Education","authors":"Thoralf Mildenberger, Martin Braschler, A. Ruckstuhl, R. Vorburger, Kurt Stockinger","doi":"10.1145/3615952.3615966","DOIUrl":"https://doi.org/10.1145/3615952.3615966","url":null,"abstract":"\"Data Scientist\" has often been considered as the sexiest job of the 21st century. As a consequence, the spectrum of data science education programs has increased significantly in recent years, and there is a high demand for data scientists at many companies. However, what training is required to become a data scientist? What is the role of data scientists in current enterprises? Is the training well-aligned to the practical needs of a job? In this article, we will address these questions by evaluating a survey of people who were trained in a continuing education program in data science in Switzerland. Our study sheds light on the practical aspects of the data science education and how this newly-gained knowledge can successfully be applied in an enterprise. One of the highlights from the point of view of the database community is the important role of SQL in data science.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130901695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}