{"title":"DDM: Data-Driven Modeling of Physical Phenomenon with Application to METOC","authors":"S. Rubin, Lydia Bouzar-Benlabiod","doi":"10.1109/IRI.2019.00057","DOIUrl":null,"url":null,"abstract":"The problem addressed by this paper pertains to the representation, acquisition, and randomization of experiential knowledge for autonomous systems in expert reconnaissance. Such systems are characterized by the requirement to render proper decisions not explicitly programmed for. Cases are defined to consist of domain-specific data (e.g., heterogeneous sensory data), which may not be fully general due to the inclusion of (a) extraneous predicates and/or because (b) the predicates are overly specific. Rules satisfy the definition of cases and result from cases (rules), which have undergone at least one of the aforementioned generalizations. Extraneous antecedent predicates may be discovered from cases (rules) sharing a common consequent, if binary tautologies are found in case (rule) pairings, or if higher tautologies are found in a multiplicity of such cases (rules). Eliminating such extraneous antecedent predicates allows for the discovery of possible additional extraneous antecedent predicates - where the antecedent of one is a proper subset of the other. Candidate rules are formed from the intersection of combinations of two or more case (rule) antecedent sets implying a common consequent. The removed antecedent subsets are acquired as new rules implying the common consequent, which are conditioned to fire by the non-monotonic actions of their common antecedent (i.e., by way of an embedded antecedent predicate) - reducing the specificity of the parents by generalizing them into smaller, more reusable rules. Similarly, more general consequent sequences are formed from common subsequences shared by two or more consequent sequences being non-deterministically implied by a common antecedent. The removed consequent subsequences are acquired as new rules, which are set to fire before or after that of its parent's common dependency - reducing the specificity of the parents by generalizing them into smaller, more reusable rules. The rule to fire first will non-monotonically trigger the rule to fire next. This process iterates, since randomization of one side may enable further randomization of the other side. Tautologies are extracted and common subsets or subsequences form candidate rules as previously described (i.e., without creating duplicate productions). The context for the transformations is provided by the cases (rules), which are effectively acquired as previously described. Knowledge is segmented on the basis of whether it is a case, or a rule. Knowledge is further dynamically segmented on the basis of maximally shared left-hand sides (LHS) and maximally shared right-hand sides (RHS) - using logical pointers to minimize space-time requirements. It is proven that the allowance for non determinism is required, which implies that candidate rules cannot be invalidated by syntactically checking them for contradiction with a known valid dependency. A possibility metric is provided for each production, which cumulatively tracks the similarity of the context and a selected production's situation. Each context may be associated with a minimum possibility metric in order that no production creating a lesser metric may fire. 
If the application of a production(s) is deemed to be unsuccessful and the production(s) are deemed not to be erroneous, then a correct case is preferentially acquired, where available (i.e., in lieu of rule deletion - by default), which will always fire in lieu of the erroneous production(s) on the given context, since it is more specific, by definition (i.e., using a most-specific-first inference engine). Random and symmetric search are integrated to insure broad coverage of the search space. The (transformed) context may be fuzzily matched to a situation, which it does not cover. Not only does this allow for the generation of questions to confirm the missing predicate information, but provides for the abduction of a possible response as well. Cases (rules) are stored in segments so as to maximize their coherency across parallel processors. Less useful knowledge is expunged in keeping with this policy. Segments or processors are subdivided 2 into local groups to minimize contention on the bus and memory in a massively parallel architecture. Acyclic transformations serve to provide a metaphorical explanation for the inference engine upon request. Finally, examples of the methodology, as applied to METOC modeling, are provided throughout. A high level of intelligence is consistent with the theory of randomization, which is fundamental for the development of truly autonomous systems for use in domains ranging from meteorology to C4ISR.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IRI.2019.00057","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The problem addressed by this paper pertains to the representation, acquisition, and randomization of experiential knowledge for autonomous systems in expert reconnaissance. Such systems are characterized by the requirement to render proper decisions that they were not explicitly programmed for. Cases are defined to consist of domain-specific data (e.g., heterogeneous sensory data), which may not be fully general because (a) they include extraneous predicates and/or (b) their predicates are overly specific. Rules satisfy the definition of cases and result from cases (or rules) that have undergone at least one of the aforementioned generalizations.

Extraneous antecedent predicates may be discovered from cases (rules) sharing a common consequent, if binary tautologies are found in case (rule) pairings, or if higher tautologies are found in a multiplicity of such cases (rules). Eliminating these extraneous antecedent predicates allows for the discovery of possible additional extraneous antecedent predicates, where the antecedent of one case (rule) is a proper subset of the other's. Candidate rules are formed from the intersection of combinations of two or more case (rule) antecedent sets implying a common consequent. The removed antecedent subsets are acquired as new rules implying the common consequent, conditioned to fire by the non-monotonic actions of their common antecedent (i.e., by way of an embedded antecedent predicate); this reduces the specificity of the parents by generalizing them into smaller, more reusable rules. Similarly, more general consequent sequences are formed from common subsequences shared by two or more consequent sequences that are non-deterministically implied by a common antecedent. The removed consequent subsequences are acquired as new rules, set to fire before or after their parent's common dependency; this likewise reduces the specificity of the parents by generalizing them into smaller, more reusable rules. The rule that fires first will non-monotonically trigger the rule that fires next. This process iterates, since randomization of one side may enable further randomization of the other. Tautologies are extracted, and common subsets or subsequences form candidate rules as previously described (i.e., without creating duplicate productions). The context for the transformations is provided by the cases (rules), which are effectively acquired as previously described.

Knowledge is segmented on the basis of whether it is a case or a rule. It is further dynamically segmented on the basis of maximally shared left-hand sides (LHS) and maximally shared right-hand sides (RHS), using logical pointers to minimize space-time requirements. It is proven that an allowance for nondeterminism is required, which implies that candidate rules cannot be invalidated by syntactically checking them for contradiction with a known valid dependency. A possibility metric is provided for each production, which cumulatively tracks the similarity between the context and a selected production's situation. Each context may be associated with a minimum possibility metric so that no production yielding a lesser metric may fire.
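To make the candidate-rule formation described above concrete, here is a minimal Python sketch of the antecedent-generalization step: antecedent sets implying a common consequent are intersected, and the removed subsets are acquired as new rules conditioned, via an embedded predicate, on the common antecedent having fired. The set-of-tokens representation and the METOC-flavored predicate names are illustrative assumptions, not the paper's actual data structures.

```python
from itertools import combinations

def generalize_antecedents(rules):
    """rules: iterable of (frozenset_of_predicates, consequent) pairs.
    Returns candidate rules formed from the intersections of antecedent
    sets sharing a common consequent, plus the removed antecedent
    subsets, acquired as new rules conditioned (via an embedded
    predicate) on the common antecedent having fired."""
    by_consequent = {}
    for antecedent, consequent in rules:
        by_consequent.setdefault(consequent, []).append(frozenset(antecedent))

    candidates = set()
    for consequent, antecedents in by_consequent.items():
        for a, b in combinations(antecedents, 2):
            common = a & b
            if not common or common in (a, b):
                continue  # nothing shared, or one antecedent subsumes the other
            # Generalized candidate: smaller and more reusable than either parent.
            candidates.add((common, consequent))
            # Removed subsets become new rules that fire only once the
            # common antecedent has fired (the embedded predicate).
            for removed in (a - common, b - common):
                candidates.add((removed | {("fired", common)}, consequent))
    return candidates

if __name__ == "__main__":
    # Hypothetical METOC-style cases sharing the consequent "rain".
    cases = [
        (frozenset({"low_pressure", "high_humidity", "radar_echo"}), "rain"),
        (frozenset({"low_pressure", "high_humidity", "buoy_report"}), "rain"),
    ]
    for antecedent, consequent in sorted(generalize_antecedents(cases), key=repr):
        print(sorted(antecedent, key=repr), "->", consequent)
```

On this toy input, the two cases share {low_pressure, high_humidity}, which becomes the generalized candidate, while radar_echo and buoy_report are split off as smaller embedded-predicate rules.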
If the application of one or more productions is deemed unsuccessful, yet the productions themselves cannot be deemed erroneous, then a correct case is preferentially acquired, where available (i.e., by default, in lieu of rule deletion). Being more specific by definition, this case will always fire in lieu of the offending production(s) on the given context (i.e., under a most-specific-first inference engine). Random and symmetric search are integrated to ensure broad coverage of the search space. The (transformed) context may be fuzzily matched to a situation that it does not cover. Not only does this allow for the generation of questions to confirm the missing predicate information, but it also provides for the abduction of a possible response. Cases (rules) are stored in segments so as to maximize their coherency across parallel processors, and less useful knowledge is expunged in keeping with this policy. Segments, or processors, are subdivided into local groups to minimize contention on the bus and memory in a massively parallel architecture. Acyclic transformations serve to provide, upon request, a metaphorical explanation for the inference engine. Finally, examples of the methodology, as applied to METOC modeling, are provided throughout. A high level of intelligence is consistent with the theory of randomization, which is fundamental to the development of truly autonomous systems for use in domains ranging from meteorology to C4ISR.
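The following sketch illustrates, under simplifying assumptions, the most-specific-first selection subject to a minimum possibility metric that the abstract describes: among the productions whose situation covers the context and meets the threshold, the most specific one fires, so an acquired correct case overrides the more general rule it corrects. The Jaccard-style similarity is a non-cumulative stand-in for the paper's cumulative possibility metric, and all names here are hypothetical.

```python
def possibility(context, situation):
    """Jaccard-style similarity between the context and a production's
    situation; a non-cumulative stand-in for the paper's metric."""
    if not situation:
        return 0.0
    return len(context & situation) / len(context | situation)

def select_production(context, productions, min_possibility=0.0):
    """Most-specific-first selection: among productions whose situation
    covers the context and whose possibility metric meets the context's
    minimum, prefer the most specific (largest) situation. An acquired
    correct case, being more specific by definition, therefore fires in
    lieu of the erroneous rule it corrects."""
    best_key, best = None, None
    for situation, consequent in productions:
        if not situation <= context:
            continue  # the production does not cover this context
        p = possibility(context, situation)
        if p < min_possibility:
            continue  # below the context's minimum possibility metric
        key = (len(situation), p)  # specificity first, then possibility
        if best_key is None or key > best_key:
            best_key, best = key, (situation, consequent)
    return best

if __name__ == "__main__":
    context = frozenset({"low_pressure", "high_humidity", "radar_echo"})
    productions = [
        (frozenset({"low_pressure"}), "forecast: showers"),  # general rule
        (frozenset({"low_pressure", "high_humidity",
                    "radar_echo"}), "forecast: heavy rain"),  # acquired case
    ]
    # The acquired case covers the full context and wins on specificity.
    print(select_production(context, productions, min_possibility=0.3))
```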