Badges and Reproducibility Reports

The reviewers submit their reports, which are published on this page. The following is the list of papers that have passed the reproducibility test, passed the functionality test, and/or made their artifacts available, along with the reproducibility reports (when applicable) and the badges awarded. Reproducibility reports are maintained here starting with the reproduced papers of SIGMOD 2020.

To access reproduced/available papers, please click:

  • All reproduced PACMMOD papers (2023 onwards)
  • All PACMMOD papers with artifacts available (2023 onwards)
  • All reproduced SIGMOD papers (until 2022)
  • All SIGMOD papers with artifacts available (until 2022)

ACM SIGMOD 2024 Badges and Reproducibility Results

To access the reproducibility reports, please click here.

Paper Title Report Badges Awarded
Proximity Queries on Point Clouds using Rapid Construction Path Oracle
PimPam: Efficient Graph Pattern Matching on Real Processing-in-Memory Hardware
Optimizing Distributed Protocols with Query Rewrites
Hierarchical Cut Labelling – Scaling Up Distance Queries on Road Networks
Proving Query Equivalence Using Linear Integer Arithmetic
Faster Algorithms for Fair Max-Min Diversification in R^d
Can Learned Indexes be Built Efficiently? A Deep Dive into Sampling Trade-offs (Best Artifact)
Query Refinement for Diverse Top-k Selection
Grafite: Taming Adversarial Queries with Optimal Range Filters
FACET: Robust Counterfactual Explanation Analytics
CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models (Honorable Mention for Best Artifact)
Temporal JSON Keyword Search
DProvDB: Differentially Private Query Processing with Multi-Analyst Provenance
PECJ: Stream Window Join on Disorder Data Streams with Proactive Error Compensation
NOCAP: Near-Optimal Correlation-Aware Partitioning Joins
Dias: Dynamic Rewriting of Pandas Code (Honorable Mention for Best Artifact)
Fast Maximal Quasi-clique Enumeration: A Pruning and Branching Co-Design Approach
MorphStream: Adaptive Scheduling for Scalable Transactional Stream Processing on Multicores (presented at SIGMOD 2023)
PreVision: An Out-of-Core Matrix Computation System with Optimal Buffer Replacement
ALP: Adaptive lossless floating-point compression (Best Artifact)
A Unified Approach for Resilience and Causal Responsibility with Integer Linear Programming (ILP) and LP Relaxations
On The Reasonable Effectiveness of Relational Diagrams: Explaining Relational Query Patterns and the Pattern Expressiveness of Relational Languages
StarfishDB: A Query Execution Engine for Relational Probabilistic Programming
On Efficient Large Sparse Matrix Chain Multiplication
Demystifying the QoS and QoE of Edge-hosted Video Streaming Applications in the Wild with SNESet
Revisiting B-tree Compression: An Experimental Study
Spruce: a Fast yet Space-saving Structure for Dynamic Graph Storage
Cabin: a Compressed Adaptive Binned Scan Index
NOC-NOC: Towards Performance-optimal Distributed Transactions
Anchor: A Library for Building Secure Persistent Memory Systems
Origin-Destination Travel Time Oracle for Map-based Services
Implementation Strategies for Views over Property Graphs
Correlation Joins over Time Series Data Streams Utilizing Complementary Dimension Reduction and Transformation
Homomorphic Compression: Making Text Processing on Compression Unlimited
RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search
Generation of Training Examples for Tabular Natural Language Inference
Udon: Efficient Debugging of User-Defined Functions in Big Data Systems with Line-by-Line Control
Approximate Sketches
SeRF: Segment Graph for Range-Filtering Approximate Nearest Neighbor Search
Relative Keys: Putting Feature Explanation into Context
Memory-Efficient and Flexible Detection of Heavy Hitters in High-Speed Networks
AirIndex: Versatile Index Tuning Through Data and Storage
Counterfactual Explanation at Will, with Zero Privacy Leakage
Sub-optimal Join Order Identification with L1-error
CAVE: Concurrency-Aware Graph Processing on SSDs
Optimizing Time Series Queries with Versions
LPLM: A Neural Language Model for Cardinality Estimation of LIKE-Queries
Rethinking Learned Cost Models: Why Start from Scratch?
Efficient k-Clique Listing: An Edge-Oriented Branching Strategy
AS-Parser: Log Parsing Based on Adaptive Segmentation
High-performance Effective Scientific Error-bounded Lossy Compression with Auto-tuned Interpolation
View-based Explanations for Graph Neural Networks
High-Ratio Compression for Machine-Generated Data
Towards Metric DBSCAN: Exact, Approximate, and Streaming Algorithms
Modularity-based Hypergraph Clustering: Random Hypergraph Model, Hyperedge-cluster Relation, and Computation
CodeS: Towards Building Open-source Language Models for Text-to-SQL
HERO: A Hierarchical Set Partitioning and Join Framework for Speeding up the Set Intersection Over Graphs
SWIX: A Memory-efficient Sliding Window Learned Index
Understanding the Performance Implications of the Design Principles in Storage-Disaggregated Databases
ThalamusDB: Approximate Query Processing on Multi-Modal Data
A Comprehensive Survey and Experimental Study of Subgraph Matching: Trends, Unbiasedness, and Interaction
Scalable Distributed Inverted List Indexes in Disaggregated Memory
ADGNN: Towards Scalable GNN Training with Aggregation-Difference Aware Sampling
Selectivity Estimation for Queries Containing Predicates over Set-Valued Attributes
Practical Dynamic Extension for Sampling Indexes
Materialized View Selection & View-Based Query Planning for Regular Path Queries
LST-Bench: Benchmarking Log-Structured Tables in the Cloud
Efficient Core Maintenance in Large Bipartite Graphs
PLATON: Top-down R-tree Packing with Learned Partition Policy
In-Database Data Imputation
Saga: A Scalable Framework for Optimizing Data Cleaning Pipelines for Machine Learning Applications
CaaS-LSM: Compaction-as-a-Service for LSM-based Key-Value Stores in Storage Disaggregated Infrastructure

ACM SIGMOD 2023 Badges and Reproducibility Results

Paper Title Report Badges Awarded
Polaris: Enabling Transaction Priority in Optimistic Concurrency Control
SafeBound: A Practical System for Generating Cardinality Bounds
Efficient Star-based Truss Maintenance on Dynamic Graphs
Efficient GPU-Accelerated Subgraph Matching
T-FSM: A Task-Based System for Massively Parallel Frequent Subgraph Pattern Mining from a Big Graph
Efficient and Effective Algorithms for Generalized Densest Subgraph Discovery
GIO: Generating Efficient Matrix and Frame Readers for Custom Data Formats by Example
AWARE: Workload-aware, Redundancy-exploiting Linear Algebra
iFlipper: Label Flipping for Individual Fairness
ML2DAC: Meta-learning to Democratize AutoML for Clustering Analysis
LightCTS: A Lightweight Framework for Correlated Time Series Forecasting
DeltaBoost: Gradient Boosting Decision Trees with Efficient Machine Unlearning (Honorable Mention for Best Artifact)
A Universal Question-Answering Platform for Knowledge Graphs
InfiniFilter: Expanding Filters to Infinity and Beyond (Best Artifact)
EAR-Oracle: On Efficient Indexing for Distance Queries between Arbitrary Points on Terrain Surface
Efficiently Computing Join Orders with Heuristic Search
Ready to Leap (by Co-Design)? Join Order Optimisation on Quantum Hardware
A Step Toward Deep Online Aggregation (Honorable Mention for Best Artifact)
High-Dimensional Approximate Nearest Neighbor Search: with Reliable and Efficient Distance Comparison Operations
MRVs: Enforcing Numeric Invariants in Parallel Updates to Hotspots with Randomized Splitting
Using Cloud Functions as Accelerator for Elastic Data Analytics
Parallel Strong Connectivity Based on Faster Reachability
INEv: In-Network Evaluation for Event Stream Processing
Discovering Similarity Inclusion Dependencies
DAMR: Dynamic Adjacency Matrix Representation Learning for Multivariate Time Series Imputation
Learned Data-aware Image Representations of Line Charts for Similarity Search
Scapin: Scalable Graph Structure Perturbation by Augmented Influence Maximization
ForestTI: A Scalable Inverted-Index-Oriented Timeseries Management System with Flexible Memory Efficiency
FactorJoin: A New Cardinality Estimation Framework for Join Queries
Detock: High Performance Multi-region Transactions at Scale
Most Expected Winner: An Interpretation of Winners over Uncertain Voter Preferences
Dumpy: A Compact and Adaptive Index for Large Data Series Collections
ST4ML: Machine Learning Oriented Spatio-Temporal Data Processing at Scale
Generalizing Bulk-Synchronous Parallel Processing for Data Science: From Data to Threads and Agent-Based Simulations
Pea Hash: A Performant Extendible Adaptive Hashing Index
Prerequisite-driven Fair Clustering on Heterogeneous Information Networks
On Querying Connected Components in Large Temporal Graphs
Efficient and Effective Attributed Hypergraph Clustering via K-Nearest Neighbor Augmentation
LightTS: Lightweight Time Series Classification with Adaptive Ensemble Distillation
How To Optimize My Blockchain? A Multi-Level Recommendation Approach
dsJSON: A Distributed SQL JSON Processor
When Tree Meets Hash: Reducing Random Reads for Index Structures on Persistent Memories
CompressGraph: Efficient Parallel Graph Analytics with Rule-Based Compression
Query-Guided Resolution in Uncertain Databases
LightRW: FPGA Accelerated Graph Dynamic Random Walks
Exploiting Structure in Regular Expression Queries
Grep: A Graph Learning Based Database Partitioning System
Maximum k-Biplex Search on Bipartite Graphs: A Symmetric-BK Branching Approach

ACM SIGMOD 2022 Badges and Reproducibility Results

Paper Title Report Badges Awarded
JEDI: These aren't the JSON documents you're looking for ... (Best Artifact)
CoLES: Contrastive Learning for Event Sequences with Self-Supervision
Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines
AutoMon: Automatic Distributed Monitoring for Arbitrary Multivariate Functions
Conjunctive Queries with Comparisons
WeTune: Automatic Discovery and Verification of Query Rewrite Rules
Hybrid Deterministic and Nondeterministic Execution of Transactions in Actor Systems
Controlled Intentional Degradation in Analytical Video Systems
Cooperative Route Planning Framework for Multiple Distributed Assets in Maritime Applications
Explaining Link Prediction Systems based on Knowledge Graph Embeddings
dCAM: Dimension-wise Class Activation Map for Explaining Multivariate Data Series Classification
HypeR: Hypothetical Reasoning With What-If and How-To Queries Using a Probabilistic Causal Approach (Honorable Mention for Best Artifact)
DataPrism: Exposing Disconnect between Data and Systems
GraphZeppelin: Storage-Friendly Sketching for Connected Components on Dynamic Graph Streams
Rank Aggregation with Proportionate Fairness
Givens QR Decomposition over Relational Databases
TASTI: Semantic Indexes for Machine Learning-based Queries over Unstructured Data
Finding Label and Model Errors in Perception Data With Learned Observation Assertions
Secure and Policy-Compliant Query Processing on Heterogeneous Computational Storage Architecture
Interpretable Data-Based Explanations for Fairness Debugging
Efficient Algorithms for Maximal k-Biplex Enumeration
Efficient Massively Parallel Join Optimization for Large Queries
Redundancy Elimination in Distributed Matrix Computation
Efficient Answering of Historical What-if Queries
HET-GMP: a Graph-based System Approach to Scaling Large Embedding Model Training
SAM: Database Generation from Query Workload with Supervised Autoregressive Model
CompactWalks: Taming Knowledge-Graph Embeddings With Domain- and Task-Specific Pathways

ACM SIGMOD 2021 Badges and Reproducibility Results

Paper Title Report Badges Awarded
Parallelizing Intra-Window Join on Multicores: An Experimental Study
REDS: Rule Extraction for Discovering Scenarios
At-the-time and Back-in-time Persistent Sketches
LIMA: Fine-grained Lineage Tracing and Reuse in Machine Learning Systems
One WITH RECURSIVE is Worth Many GOTOs
Conformance Constraint Discovery: Measuring Trust in Data-Driven Systems (Best Artifact)
ExDRa: Exploratory Data Science on Federated Raw Data
MxTasks: How to Make Efficient Synchronization and Prefetching Easy
Efficient Graph Summarization using Weighted LSH at Billion-Scale
Putting Things into Context: Rich Explanations for Query Answers using Join Graphs
TreeToaster: Towards an IVM-Optimized Compiler
EIRES: Efficient Integration of Remote Data in Event Stream Processing
SliceLine: Fast, Linear-Algebra-based Slice Finding for ML Model Debugging
Adaptive Compression for Fast Scans on String Columns
Tuplex: Data Science in Python at Native Code Speed
Looking for Trouble: Analyzing Classifier Behavior via Pattern Divergence
Parallel Index-based Structural Graph Clustering and its Approximation
Efficient Uncertainty Tracking for Complex Queries with Attribute-level Bounds
COMPASS: Online Sketch-based Query Optimization for In-Memory Databases
Clonos: Consistent Causal Recovery for Highly-Available Streaming Dataflows
Maximizing Persistent Memory Bandwidth Utilization for OLAP Workloads
Explaining Black-Box Algorithms Using Probabilistic Contrastive Counterfactuals
HADAD: A Lightweight Approach for Optimizing Hybrid Complex Analytics Queries
Efficient Exploration of Interesting Aggregates in RDF Graphs
AlphaEvolve: A Learning Framework to Discover Novel Alphas in Quantitative Investment

ACM SIGMOD 2020 Badges and Reproducibility Results

Paper Title Report Badges Awarded
QueryVis: Logic-based Diagrams help Users Understand Complicated SQL Queries Faster (Best Artifact)
Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks
Efficient Algorithms for Densest Subgraph Discovery on Large Directed Graphs
Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study
Factorized Graph Representations for Semi-Supervised Learning from Sparse Data (Best Artifact)
Functional-Style SQL UDFs With a Capital 'F' (Best Artifact)
Locality-Sensitive Hashing Scheme based on Longest Circular Co-Substring
BugDoc: Algorithms to Debug Computational Processes
Database Benchmarking for Supporting Real-Time Interactive Querying of Large Data
Automating Incremental and Asynchronous Evaluation for Recursive Aggregate Data Processing
Pump Up the Volume: Processing Large Data on GPUs with Fast Interconnects
LISA: A Learned Index Structure for Spatial Data
Timely Reporting of Heavy Hitters using External Memory
Theoretically-Efficient and Practical Parallel DBSCAN
Towards Interpretable and Learnable Risk Analysis for Entity Resolution
BinDex: A Two-Layered Index for Fast and Robust Scans
