Enhancing Skyline Query Processing on Large and Incomplete Graphs with Graph Neural Networks: A Hybrid Machine Learning Approach

Authors

  • Hasan Khair Adzman, Department of Computer Science, International Islamic University Malaysia
  • Raini Hassan, Department of Computer Science, International Islamic University Malaysia, Kuala Lumpur, Malaysia
  • Dini Oktarina Dwi Handayani, Department of Computer Science, International Islamic University Malaysia, Kuala Lumpur, Malaysia

Keywords:

Skyline query processing, Graph Neural Networks (GNNs), incomplete data, Pareto optimality, ISkyline, multi-criteria decision making, data imputation, machine learning, query optimization, scalability

Abstract

Skyline query processing is essential in multi-criteria decision-making because it retrieves the Pareto-optimal results without requiring user-defined weights. Traditional skyline methods, however, face significant challenges when applied to large-scale and incomplete datasets. This study proposes a hybrid approach that integrates the ISkyline dominance graph technique with Graph Neural Networks (GNNs) to improve skyline query performance under such conditions. The GNN component is used to predict skyline tuples when attribute values are missing. Evaluation on both synthetic and real-world datasets shows improved accuracy and efficiency compared with established methods such as ISkyline, SIDS, and OIS. The results indicate that the approach can make query processing more efficient, supporting applications in e-commerce, finance, and smart data systems, and aligning with the 9th Sustainable Development Goal on industry, innovation, and infrastructure.
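
To make the notion of a skyline over incomplete data concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that smaller values are better on every dimension, that a missing value is represented by `None`, and that two tuples are compared only on the dimensions where both values are known, which is the dominance definition commonly used for incomplete data. The names `dominates` and `naive_skyline` and the sample tuples are hypothetical and chosen for illustration only.

```python
from typing import List, Optional, Sequence

# A tuple with possibly missing attribute values (None = missing).
Point = Sequence[Optional[float]]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q on the dimensions known in both.

    Assumption: smaller is better on every dimension.
    """
    shared = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not shared:
        # No common known dimension: the two tuples are incomparable.
        return False
    no_worse = all(a <= b for a, b in shared)
    strictly_better = any(a < b for a, b in shared)
    return no_worse and strictly_better


def naive_skyline(points: List[Point]) -> List[Point]:
    """Quadratic-time baseline: keep tuples that no other tuple dominates."""
    return [
        q for q in points
        if not any(dominates(p, q) for p in points if p is not q)
    ]


# Hypothetical sample data: (price, distance, noise level).
a = (100.0, 2.0, None)
b = (120.0, 3.0, 1.0)
c = (None, 1.0, 2.0)
d = (90.0, None, 3.0)

print(naive_skyline([a, b, c, d]))
```

With missing values this dominance relation is not guaranteed to be transitive, so a tuple can be eliminated by another tuple that is itself dominated. This is one reason why the pairwise baseline above scales poorly and why dedicated techniques for incomplete data exist.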

Published

30-07-2025

How to Cite

Adzman, H. K., Hassan, R., & Dwi Handayani, D. O. (2025). Enhancing Skyline Query Processing on Large and Incomplete Graphs with Graph Neural Networks: A Hybrid Machine Learning Approach. International Journal on Perceptive and Cognitive Computing, 11(2), 162–172. Retrieved from https://journals.iium.edu.my/kict/index.php/IJPCC/article/view/577
