Misc,

A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases

J. Sequeda, D. Allemang, and B. Jacob.
(2023)

Abstract

Enterprise applications of Large Language Models (LLMs) hold promise for question answering on enterprise SQL databases. However, the extent to which LLMs can accurately respond to enterprise questions in such databases remains unclear, given the absence of suitable Text-to-SQL benchmarks tailored to enterprise settings. Additionally, the potential of Knowledge Graphs (KGs) to enhance LLM-based question answering by providing business context is not well understood. This study aims to evaluate the accuracy of LLM-powered question answering systems in the context of enterprise questions and SQL databases, while also exploring the role of knowledge graphs in improving accuracy. To achieve this, we introduce a benchmark comprising an enterprise SQL schema in the insurance domain, a range of enterprise queries encompassing reporting to metrics, and a contextual layer incorporating an ontology and mappings that define a knowledge graph. Our primary finding reveals that question answering using GPT-4, with zero-shot prompts directly on SQL databases, achieves an accuracy of 16%. Notably, this accuracy increases to 54% when questions are posed over a Knowledge Graph representation of the enterprise SQL database. Therefore, investing in Knowledge Graph provides higher accuracy for LLM powered question answering systems.

BibTeX key: sequeda2023benchmark
entry type: misc
year: 2023
eprint: 2311.07509
archiveprefix: arXiv
primaryclass: cs.AI
Document: https://arxiv.org/pdf/2311.07509.pdf

Cite this publication

@misc{sequeda2023benchmark,
  abstract      = {Enterprise applications of Large Language Models (LLMs) hold promise for question answering on enterprise SQL databases. However, the extent to which LLMs can accurately respond to enterprise questions in such databases remains unclear, given the absence of suitable Text-to-SQL benchmarks tailored to enterprise settings. Additionally, the potential of Knowledge Graphs (KGs) to enhance LLM-based question answering by providing business context is not well understood. This study aims to evaluate the accuracy of LLM-powered question answering systems in the context of enterprise questions and SQL databases, while also exploring the role of knowledge graphs in improving accuracy. To achieve this, we introduce a benchmark comprising an enterprise SQL schema in the insurance domain, a range of enterprise queries encompassing reporting to metrics, and a contextual layer incorporating an ontology and mappings that define a knowledge graph. Our primary finding reveals that question answering using GPT-4, with zero-shot prompts directly on SQL databases, achieves an accuracy of 16%. Notably, this accuracy increases to 54% when questions are posed over a Knowledge Graph representation of the enterprise SQL database. Therefore, investing in Knowledge Graph provides higher accuracy for LLM powered question answering systems.},
  added-at      = {2023-11-18T09:24:48.000+0100},
  archiveprefix = {arXiv},
  author        = {Sequeda, Juan and Allemang, Dean and Jacob, Bryon},
  biburl        = {https://www.bibsonomy.org/bibtex/2fe0b0072fe2262902374dfaef6b20a85/ghagerer},
  eprint        = {2311.07509},
  interhash     = {79fccd07237789b64122b2d69550b117},
  intrahash     = {fe0b0072fe2262902374dfaef6b20a85},
  keywords      = {SQL databases knowledge-graphs llms query-generation},
  primaryclass  = {cs.AI},
  timestamp     = {2023-11-18T09:24:48.000+0100},
  title         = {A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases},
  url           = {https://arxiv.org/pdf/2311.07509.pdf},
  year          = 2023
}
