GNN-Based Skyline Query Processing for Large-Scale and Incomplete Graphs

Authors

Adzman, H. K., Hassan, R., & Dwi Handayani, D. O.

DOI:

https://doi.org/10.31436/iiumej.v27i1.3717

Keywords:

Skyline query processing, Graph Neural Networks (GNNs), Incomplete data, Pareto optimality, Machine learning

Abstract

Skyline queries are a core operation in database management: they select the optimal points from a multi-dimensional dataset based on dominance relationships, that is, the points that no other point dominates. They are widely used in decision-making, recommendation systems, and data filtering. However, traditional skyline algorithms struggle with large data volumes and missing values, which lead to high computational cost and inefficiency. This research proposes a hybrid approach that integrates the ISkyline dominance graph technique with Graph Neural Networks (GNNs) to improve skyline query performance under these conditions. The GNN component predicts skyline tuples when data are missing or incomplete. Evaluation on both synthetic and real-world datasets shows improved accuracy and efficiency compared with established methods such as ISkyline, SIDS, and OIS. The results indicate that the approach can make query processing more efficient and can support applications in e-commerce, finance, and smart data systems.
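To make the dominance relationship concrete, the following minimal sketch computes a skyline over tuples with missing values. It is an illustration, not the method proposed in the paper. It assumes that smaller values are better in every dimension and that two tuples are compared only on the dimensions both of them have, a convention commonly used for skylines over incomplete data. The names `dominates`, `naive_skyline`, and the `hotels` sample are hypothetical.

```python
from typing import List, Optional, Sequence

# A tuple of attribute values; None marks a missing value (assumed encoding).
Point = Sequence[Optional[float]]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q on the dimensions both points have.

    Assumption: smaller is better in every dimension.
    """
    strictly_better = False
    for a, b in zip(p, q):
        if a is None or b is None:
            continue  # skip dimensions missing in either point
        if a > b:
            return False  # p is worse somewhere, so it cannot dominate q
        if a < b:
            strictly_better = True
    return strictly_better


def naive_skyline(points: Sequence[Point]) -> List[Point]:
    """Keep every point that no other point dominates (pairwise O(n^2) scan)."""
    return [
        p
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical sample: (price, distance), with some values missing.
hotels = [
    (50.0, 3.0),
    (80.0, 1.0),
    (90.0, 4.0),
    (None, 3.5),
    (85.0, None),
]

print(naive_skyline(hotels))
```

Under this comparison rule, dominance over incomplete data is not guaranteed to be transitive and can even form cycles, so pruning shortcuts that are safe for complete data need care. The pairwise scan above avoids such shortcuts at the price of quadratic cost, which is one reason large, incomplete datasets are expensive for traditional algorithms.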

Published

2026-01-12

How to Cite

Adzman, H. K., Hassan, R., & Dwi Handayani, D. O. (2026). GNN-Based Skyline Query Processing for Large-Scale and Incomplete Graphs. IIUM Engineering Journal, 27(1), 27–47. https://doi.org/10.31436/iiumej.v27i1.3717

Issue

Vol. 27 No. 1 (2026)

Section

Electrical, Computer and Communications Engineering