Bootstrapping Natural Language Querying on Process Automation Data
Xue Han, L. Hu, J. Sen, Yabin Dang, Buyu Gao, Vatche Isahagian, Chuan Lei, Vasilis Efthymiou, Fatma Özcan, A. Quamar, Ziming Huang, Vinod Muthusamy
2020 IEEE International Conference on Services Computing (SCC), published 2020-11-01
DOI: https://doi.org/10.1109/SCC49832.2020.00030
Citations: 6
Abstract
The growing adoption of business process management platforms has led to increasing volumes of runtime event logs, which contain information about process execution. Business users analyze this event data for real-time insights into performance and optimization opportunities. However, querying the event data is difficult for business users who do not know the details of the backend store, data schema, and query language. Consequently, business insights are mostly limited to static dashboards that capture only predefined performance metrics. In this paper, we introduce an interface that lets business users query business event data in natural language, without knowing the exact schema of the event data or the query language. Moreover, we propose a bootstrapping pipeline that uses both event data and business domain-specific artifacts to automatically instantiate the natural language interface over the event data. We build and evaluate our prototype on datasets from both practical projects and public challenge event data stored in Elasticsearch. Experimental results show that our system achieves an average accuracy of 80% across all datasets, with high precision (91%) and good recall (81%).
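
To make the idea of a natural language interface over Elasticsearch more concrete, here is a minimal sketch of how a simple question could be turned into an Elasticsearch aggregation request body. It is not the paper's method: the keyword matching, the vocabulary tables, the field names (`duration_ms`, `status`, `task_name`), and the function `question_to_query` are all hypothetical and chosen only for illustration. The vocabulary is hard-coded here, whereas a bootstrapping step would presumably derive such a mapping from the event data and domain artifacts.

```python
# Hypothetical vocabulary mapping business terms to event-log field names.
# These names are assumptions for illustration, not taken from the paper.
FIELD_SYNONYMS = {
    "duration": "duration_ms",
    "status": "status",
    "task": "task_name",
}

# Hypothetical mapping from question words to Elasticsearch metric aggregations.
METRIC_WORDS = {"average": "avg", "maximum": "max", "minimum": "min", "total": "sum"}


def first_field(tokens):
    """Return the field name for the first known business term, or None."""
    return next((FIELD_SYNONYMS[t] for t in tokens if t in FIELD_SYNONYMS), None)


def question_to_query(question: str) -> dict:
    """Translate a simple question into an Elasticsearch aggregation body."""
    tokens = question.lower().replace("?", "").split()

    metric = next((METRIC_WORDS[t] for t in tokens if t in METRIC_WORDS), None)
    if metric is None:
        raise ValueError("no supported metric word found in the question")

    # Words after "by" name the grouping field; words before it name the measure.
    if "by" in tokens:
        split = tokens.index("by")
        measure_tokens, group_tokens = tokens[:split], tokens[split + 1:]
    else:
        measure_tokens, group_tokens = tokens, []

    measure = first_field(measure_tokens)
    if measure is None:
        raise ValueError("no known measure field found in the question")

    metric_agg = {"result": {metric: {"field": measure}}}
    group = first_field(group_tokens)
    if group is None:
        return {"size": 0, "aggs": metric_agg}

    # Nest the metric inside a terms aggregation to get one value per group.
    return {
        "size": 0,
        "aggs": {"groups": {"terms": {"field": group}, "aggs": metric_agg}},
    }
```

Under these assumptions, `question_to_query("What is the average duration by status?")` yields a `terms` aggregation on `status` with a nested `avg` on `duration_ms`. The resulting dictionary could be sent as the body of a search request, and the user never has to see the field names or the query syntax.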