Efficient Skyline Query Processing in Incomplete Graph Databases Using Machine Learning Techniques
DOI:
https://doi.org/10.31436/ijpcc.v11i2.595
Keywords:
Skyline queries, Incomplete graph database, Machine learning, Graph database
Abstract
Skyline queries play a critical role in multi-criteria decision-making systems by retrieving the non-dominated data points from large datasets. The rapid growth of graph-structured data across many domains makes it difficult to process skyline queries efficiently over large, incomplete graph databases: missing values and high dimensionality make the computation expensive, and traditional techniques often fail to scale or to handle data imperfections. A scalable framework is therefore needed that can manage missing data, reduce computational overhead, and improve skyline query efficiency. This study adopts the Design Science Research Methodology (DSRM) to design and implement an optimisation framework that integrates machine learning techniques, including domination score ranking, dimension-based filtering, K-Means clustering and quicksort. Together, these methods reduce the search space and eliminate redundant comparisons. Experimental evaluation on real graph datasets demonstrates significant improvements in skyline computation time and accuracy, with clear reductions in pairwise comparisons on large-scale graphs. By using these techniques for sorting, filtering and clustering, the approach reduces computational complexity and enhances scalability. The results suggest promising directions for applying intelligent query optimisation in big data environments.
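
To make the notions of dominance and domination score concrete, the following is a minimal Python sketch, not the framework evaluated in this study. It assumes that smaller values are preferred, that a missing value is represented as None, and that two points are compared only on the dimensions where both have a value (a commonly used definition for incomplete data). The function names dominates and naive_skyline are hypothetical, and the K-Means clustering and dimension-based filtering steps are omitted.

```python
from typing import List, Optional, Sequence

Point = Sequence[Optional[float]]  # None marks a missing value (assumption)


def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q on every dimension observed in both
    points and strictly better on at least one (smaller is better)."""
    strictly_better = False
    for a, b in zip(p, q):
        if a is None or b is None:
            continue  # skip dimensions missing in either point
        if a > b:
            return False
        if a < b:
            strictly_better = True
    return strictly_better


def naive_skyline(points: List[Point]) -> List[Point]:
    """Quadratic baseline: return the points dominated by no other point."""
    n = len(points)
    # Domination score: how many other points each point dominates.
    scores = [
        sum(dominates(points[i], points[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    # Visit high-score points first; any() stops at the first dominator,
    # so strong dominators placed early tend to end each scan sooner.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [
        points[i]
        for i in order
        if not any(dominates(points[j], points[i]) for j in order if j != i)
    ]


points = [(1, 2, None), (2, 3, 4), (None, 1, 5), (3, None, 6)]
print(naive_skyline(points))
```

Under this definition dominance is not guaranteed to be transitive and can even form cycles on incomplete data, so pruning a dominated point early is not always safe. The sketch therefore compares every point against all others rather than against surviving candidates only.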

