The availability of data repositories simplified the evaluation of some types of learning systems for some types of learning tasks, such as classification. Unfortunately, when these repositories succeeded, they encouraged an overemphasis on studies involving isolated data sets, to the detriment of studies on embedded learning systems. For example, comparatively few researchers have examined the utility of learning in the context of cognitive systems, in part because such studies are more challenging to conduct (e.g., more costly, more time-consuming, less well understood). However, embedded investigations of this type hold great promise for the research community, both by providing a platform for investigating interesting research topics (e.g., constraining the outputs of learning systems, real-time learning, learning in the context of other reasoning behaviors and huge amounts of background knowledge) and by serving as a means to collaborate with other cognitive systems researchers. To support these efforts, we are developing a testbed that facilitates the study of learning in a subset of cognitive systems, namely gaming simulators. In this talk, I will describe our initial progress and its relationship to other efforts.
We describe an integrated acquisition interface that includes several techniques to support users in various ways as they add new knowledge to an intelligent system. As a result of this integration, the individual techniques can take better advantage of the context in which they are invoked and provide stronger guidance to users. We describe our current implementation using examples from a travel planning domain, and demonstrate how users can add complex problem-solving knowledge to the system.
This talk describes joint work with Jihie Kim, Surya Ramachandran and Yolanda Gil.
Relevant papers:
Blythe, J., Kim, J., Ramachandran, S., & Gil, Y. (2001). An integrated environment for knowledge acquisition. Proceedings of the International Conference on Intelligent User Interfaces (pp. 13-20). Santa Fe, NM.
Blythe, J. (2001). Integrating expectations to support end users to acquire procedural knowledge. Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (pp. 943-952). Seattle, WA.
Current general-purpose planners use powerful domain-independent search heuristics to generate solutions for problems in a variety of domains. However, in some situations these heuristics force the planner to perform inefficiently or to obtain solutions of poor quality. Learning from problem-solving episodes can help to identify the particular situations in which the domain-independent heuristics need to be overridden. I will give an overview of work on how to learn control knowledge that improves the planner's performance and solution quality, using both incremental and non-incremental methods. I will discuss the advantages and disadvantages of each approach for this learning task, and also present recent work on related issues, such as combining different sources of control knowledge.
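To make the notion of learned control knowledge concrete, the following is a minimal sketch in Python; the rule format, state encoding, and domain literals are illustrative assumptions, not the representation used in this work. It shows control rules that override a domain-independent heuristic when their conditions match the current search state; learning would then amount to adding or revising such rules from problem-solving episodes in which the default heuristic led the planner astray.

```python
# Minimal sketch: learned control rules override a domain-independent
# heuristic when selecting the next operator during search.
# The rule format and state representation here are illustrative only.

from dataclasses import dataclass
from typing import Callable, FrozenSet, List

State = FrozenSet[str]          # a state is a set of ground literals

@dataclass
class ControlRule:
    conditions: FrozenSet[str]  # literals that must hold in the state
    goal: str                   # goal this rule was learned for
    preferred_operator: str     # operator to prefer when the rule fires

def select_operator(state: State,
                    goal: str,
                    candidates: List[str],
                    rules: List[ControlRule],
                    heuristic: Callable[[State, str, str], float]) -> str:
    """Prefer an operator recommended by a matching control rule;
    otherwise fall back to the domain-independent heuristic."""
    for rule in rules:
        if (rule.goal == goal and rule.conditions <= state
                and rule.preferred_operator in candidates):
            return rule.preferred_operator
    return min(candidates, key=lambda op: heuristic(state, goal, op))

if __name__ == "__main__":
    rules = [ControlRule(frozenset({"at(truck, depot)"}),
                         "delivered(pkg1)", "load(pkg1, truck)")]
    state = frozenset({"at(truck, depot)", "at(pkg1, depot)"})
    print(select_operator(state, "delivered(pkg1)",
                          ["drive(truck, city)", "load(pkg1, truck)"],
                          rules, lambda s, g, op: len(op)))
```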
Relevant papers:
Aler, R., Borrajo, D., & Isasi, P. (2002). Using genetic programming to learn and improve control knowledge. Artificial Intelligence, 141, 29-56.
Borrajo, D., & Veloso, M. (1995). Lazy incremental learning of control knowledge for efficiently obtaining quality plans. AI Review Journal, 11, 371-405.
Aler, R., & Borrajo, D. (2002). On control knowledge acquisition by exploiting human-computer interaction. In M. Ghallab, J. Hertzberg, & P. Traverso (Eds.), Proceedings of the Sixth International Conference on Artificial Intelligence Planning Systems (pp. 112-120). Toulouse, France: AAAI Press.
Many large-scale, real-world problems are readily represented, solved, and understood as constraint-satisfaction problems. Although constraint programming offers a wealth of good, general-purpose methods for solving many real-world problems, each new, large-scale constraint problem faces the same bottleneck: difficult constraint programming problems need expert constraint programmers. Their solution remains more an art form than an automated process, in part because the interactions among existing methods are not well understood. Moreover, there is increasing evidence that different classes of constraint-satisfaction problems respond best to different heuristics. We are developing a program, ACE, that can support constraint programmers by learning good combinations of heuristics for a particular class of problems specified by the user. ACE can support a novice constraint programmer in the selection of heuristics, it can learn new, efficient heuristics previously unidentified by experts but readily usable in other programming environments, and it can learn heuristics for problem classes that do not respond well to ordinary, off-the-shelf approaches.
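As a rough sketch of what learning good combinations of heuristics could look like, the Python code below combines several variable-ordering heuristics by weighted voting and nudges the weights from search outcomes; the advisors, voting rule, and weight update are illustrative assumptions, not ACE's actual algorithm.

```python
# Minimal sketch: combine several variable-ordering heuristics by
# weighted voting, and adjust weights based on search outcomes.
# Advisors, voting rule, and update are illustrative, not ACE's own.

from typing import Callable, Dict, List

Advisor = Callable[[str, Dict[str, List[int]]], float]

def min_domain(var, domains):      # prefer variables with small remaining domains
    return -len(domains[var])

def max_domain(var, domains):      # prefer variables with large remaining domains
    return len(domains[var])

def weighted_choice(domains: Dict[str, List[int]],
                    advisors: Dict[str, Advisor],
                    weights: Dict[str, float]) -> str:
    """Each advisor scores every unassigned variable; pick the variable
    with the highest weighted total score."""
    def score(var):
        return sum(weights[name] * adv(var, domains)
                   for name, adv in advisors.items())
    return max(domains, key=score)

def update_weights(weights: Dict[str, float],
                   credited: List[str],
                   success: bool,
                   step: float = 0.1) -> None:
    """Reward advisors whose advice was followed on a solved problem,
    penalise them otherwise (a crude stand-in for real credit assignment)."""
    for name in credited:
        weights[name] += step if success else -step

if __name__ == "__main__":
    domains = {"x": [1, 2, 3], "y": [1], "z": [1, 2]}
    advisors = {"min_domain": min_domain, "max_domain": max_domain}
    weights = {"min_domain": 1.0, "max_domain": 0.2}
    print("select", weighted_choice(domains, advisors, weights))
    update_weights(weights, ["min_domain"], success=True)
    print(weights)
```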
Relevant papers:
Epstein, S. L. (2004). Metaknowledge for autonomous systems. Working Notes of the AAAI Spring Symposium on Knowledge Representation and Ontology for Autonomous Systems. Menlo Park, CA: AAAI Press.
Epstein, S. L., Freuder, E. C., Wallace, R., Morozov, A., & Samuels, B. (2002). The adaptive constraint engine. Principles and Practice of Constraint Programming - CP2002, 2470. Berlin: Springer Verlag.
Epstein, S. L., & Freuder, E. C. (2001). Collaborative learning for constraint solving. Principles and Practice of Constraint Programming - CP2001, 2239. Berlin: Springer Verlag.
Our working hypothesis is that the flexibility and breadth of human common sense reasoning and learning arises from analogical reasoning and learning from experience. This hypothesis suggests a very different approach to building robust cognitive software than is typically proposed. Reasoning and learning by analogy are central, rather than exotic operations undertaken only rarely. Accumulating and refining examples becomes central to building systems that can learn and adapt. One way we are exploring this hypothesis is by developing Companion Cognitive Systems, a new architecture for software that can be effectively treated as a collaborator. Our goal is for Companions to be capable of operating for weeks and months at a time, continually adapting and learning about the domains they are working in, their users, and themselves. Companions will be implemented as a collection of agents, running on a cluster, so that, for instance, analogical retrieval of relevant precedents proceeds entirely in parallel with other reasoning processes, such as the visual processing involved in understanding a user's sketched input. This talk will discuss our approach to analogical processing and our work in progress on creating the first Companions.
Relevant papers:
Forbus, K., & Gentner, D. (1997). Qualitative mental models: Simulations or memories? Proceedings of the Eleventh International Workshop on Qualitative Reasoning (pp. 97-104). Cortona, Italy.
Kuehne, S., Gentner, D., & Forbus, K. (2000). Modeling infant learning via symbolic structural alignment. Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society. Philadelphia, PA.
Forbus, K. (2001). Exploring analogy in the large. In D. Gentner, K. Holyoak, & B. Kokinov (Eds.), The analogical mind: Perspectives from cognitive science. Cambridge, MA: MIT Press.
I will describe two systems that mix automatic and deliberative learning processes in order to acquire and tune problem-solving knowledge. Cascade provides a model of the "self-explanation" and "fading" effects in human problem solving. Explaining these effects relies in part on the interaction of an automatic mechanism, called analogical search control, that continuously stores goal-oriented problem-solving traces to guide search on future similar problems, and a "deliberative" mechanism, called explanation-based learning of correctness, that lets the system investigate and acquire new problem-solving rules when faced with an impasse. GIPS provides a model of problem-solving strategy invention for children learning to add. Again, the model depends on interacting mechanisms for automatic knowledge tuning (using updates of weights for Bayesian combinations of features) and deliberative changes to knowledge structures (reflecting on potential changes to the preconditions of operators). I will provide an overview of both of these systems, with an emphasis on the importance of the interacting learning mechanisms. I will also discuss some recent research using Cascade to explain why and how humans acquire problem-solving knowledge more effectively in some situations than others.
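The abstract mentions automatic tuning via updates of weights for Bayesian combinations of features; the sketch below shows one simple form such tuning could take, scoring candidate operators with per-feature success counts combined naively. The feature names and update rule are assumptions for illustration, not GIPS's actual formulation.

```python
# Minimal sketch: score candidate operators by a naive Bayesian
# combination of binary features, and update per-feature counts from
# observed outcomes.  The feature set and update are illustrative only.

from collections import defaultdict
from typing import Dict, FrozenSet, List

class FeatureTuner:
    def __init__(self):
        # counts[feature] = [times feature present and operator succeeded,
        #                    times feature present overall]  (Laplace prior)
        self.counts: Dict[str, List[int]] = defaultdict(lambda: [1, 2])

    def score(self, features: FrozenSet[str]) -> float:
        """Combine per-feature success probabilities as if independent."""
        p = 1.0
        for f in features:
            succ, total = self.counts[f]
            p *= succ / total
        return p

    def update(self, features: FrozenSet[str], succeeded: bool) -> None:
        for f in features:
            self.counts[f][1] += 1
            if succeeded:
                self.counts[f][0] += 1

if __name__ == "__main__":
    tuner = FeatureTuner()
    add_units = frozenset({"column=units", "carry=False"})
    add_tens = frozenset({"column=tens", "carry=True"})
    tuner.update(add_units, succeeded=True)
    tuner.update(add_tens, succeeded=False)
    print(tuner.score(add_units), tuner.score(add_tens))
```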
Relevant papers:
Jones, R. M., & Fleischman, E. S. (2001). Cascade explains and informs the utility of fading examples to problems. Proceedings of the Twenty-Third Annual Conference of the Cognitive Science Society (pp. 459-464). Hillsdale, NJ: Lawrence Erlbaum.
Jones, R. M., & VanLehn, K. (1994). Acquisition of children's addition strategies: A model of impasse-free, knowledge-level learning. Machine Learning, 16, 11-36.
VanLehn, K., & Jones, R. M. (1993). Integration of explanation-based learning of correctness and analogical search control. In S. Minton (Ed.), Machine learning methods for planning. Los Altos, CA: Morgan Kaufmann.
In this introductory talk, I review the history of machine learning and its role in the computational study of reasoning and problem solving, including the causes of the decline of work on this topic and its importance in developing integrated cognitive systems.
Many potential applications for agent technology require humans and agents to work together to achieve complex tasks effectively. In contrast, most of the work in the agents community to date has focused on technologies for fully autonomous agent systems. We describe a framework for the directability of agents, in which a human supervisor can define policies to influence agent activities at execution time. The framework focuses on the concepts of adjustable autonomy for agents (i.e., varying the degree to which agents make decisions without human intervention) and strategy preference (i.e., recommending how agents should accomplish assigned tasks). These mechanisms constitute a form of 'learning by being told' that enable a human to customize the operations of agents to suit individual preferences and situation dynamics, leading to improved system reliability and increased user confidence over fully automated agent systems.
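As a simplified illustration of execution-time directability, the sketch below encodes two hypothetical policy types, a permission requirement (adjustable autonomy) and a strategy preference, and consults them when an agent selects how to carry out a task; the policy representation is an assumption, not the framework's actual language.

```python
# Minimal sketch: execution-time policies that (a) require human approval
# for some tasks and (b) express strategy preferences.  The policy
# representation here is illustrative, not the actual framework.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Policy:
    task: str
    require_approval: bool = False                                  # adjustable autonomy
    preferred_strategies: List[str] = field(default_factory=list)   # strategy preference

def choose_strategy(task: str,
                    available: List[str],
                    policies: Dict[str, Policy]) -> str:
    policy = policies.get(task)
    if policy is None:
        return available[0]
    if policy.require_approval:
        print(f"[agent] asking supervisor before executing '{task}'")
    for strategy in policy.preferred_strategies:
        if strategy in available:
            return strategy
    return available[0]

if __name__ == "__main__":
    policies = {"survey_area": Policy("survey_area",
                                      require_approval=True,
                                      preferred_strategies=["low_altitude_sweep"])}
    print(choose_strategy("survey_area",
                          ["grid_search", "low_altitude_sweep"], policies))
```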
Relevant papers:
Myers, K. L., & Morley, D. N. (in press). Policy-based agent directability. In H. Hexmoor, R. Falcone, & C. Castelfranchi (Eds.), Adjustable autonomy. Boston: Kluwer Academic Publishers.
Myers, K. L., & Morley, D. N. (2002). Resolving conflicts in agent guidance. Proceedings of the AAAI-02 Workshop on Preferences in AI and CP: Symbolic Approaches.
Systems that learn problem-solving strategies typically assume full knowledge of an operator's effects, but this may not be available for some domains. In this talk, I will discuss IMPROV, a system that instead requires only the ability to execute its operators. An IMPROV agent's plans are represented as rule sets that efficiently guide the agent in making local decisions during execution. Learning occurs during plan execution whenever the agent's knowledge is insufficient to determine the next action to take, producing a method that is more appropriate in stochastic environments than direct plan monitoring. IMPROV's method for correcting domain knowledge focuses on correcting operator preconditions. This involves generating and executing alternative plans in decreasing order of expected likelihood of reaching the current goal. Once it has discovered a successful plan, IMPROV uses an inductive learning module to correct the preconditions of the operators used in these plans. Learning is based on the last k instances, which provides the benefits of incremental improvement and better credit assignment than traditional reinforcement learning. Actions are corrected by recursively re-using the precondition correction method. The agent's domain knowledge is encoded as a hierarchy of operators of progressively smaller grain size. Incorrect actions at higher levels are corrected by changing the preconditions of the sub-operators that implement them. This lets IMPROV learn complex actions with durations and conditional effects. I will also briefly discuss a new system, Redux, that uses diagrams specified by an expert to construct a library of training examples that support rapid acquisition of task knowledge.
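To give a feel for precondition correction from recent execution experience, here is a minimal sketch that keeps the last k (state, outcome) instances for an operator and induces candidate preconditions as literals common to all successful applications but absent from some failure; the representation is only an assumption, and IMPROV's actual inductive module is more sophisticated.

```python
# Minimal sketch: keep the last k execution instances of an operator and
# induce candidate preconditions as the literals common to all successful
# applications but absent from at least one failed one.  Illustrative only.

from collections import deque
from typing import Deque, FrozenSet, Tuple

State = FrozenSet[str]
Instance = Tuple[State, bool]   # (state the operator was applied in, succeeded?)

class PreconditionLearner:
    def __init__(self, k: int = 10):
        self.window: Deque[Instance] = deque(maxlen=k)

    def record(self, state: State, succeeded: bool) -> None:
        self.window.append((state, succeeded))

    def induced_preconditions(self) -> FrozenSet[str]:
        successes = [s for s, ok in self.window if ok]
        failures = [s for s, ok in self.window if not ok]
        if not successes:
            return frozenset()
        common = frozenset.intersection(*successes)
        # keep only literals that distinguish successes from at least one failure
        if failures:
            common = frozenset(lit for lit in common
                               if any(lit not in f for f in failures))
        return common

if __name__ == "__main__":
    learner = PreconditionLearner(k=5)
    learner.record(frozenset({"door_open", "holding_key"}), True)
    learner.record(frozenset({"door_open"}), True)
    learner.record(frozenset({"holding_key"}), False)
    print(learner.induced_preconditions())   # expect {'door_open'}
```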
Relevant papers:
Pearson, D. J., & Laird, J. E. (1999). Toward incremental knowledge correction for agents in complex environments. Machine Intelligence 15. Oxford University Press.
Pearson, D. J. (1996). Learning procedural planning knowledge in complex environments. Doctoral dissertation, EECS Department, University of Michigan, Ann Arbor.
Pearson, D. J., & Laird, J. E. (2003). Example-driven diagrammatic tools for rapid knowledge acquisition. Proceedings of the K-CAP 2003 Workshop on Visualizing Information in Knowledge Engineering. Sanibel Island, FL.
I will discuss METAGAMER, the first program designed within the paradigm of meta-game playing, or "metagame" for short. This program plays metagame in the class of symmetric chess-like games, which includes chess, Chinese chess, checkers, draughts, and Shogi. METAGAMER takes as input the rules of a specific game and analyses those rules to construct an efficient representation and an evaluation function for that game, for use by a generic search engine. The strategic analysis performed by METAGAMER relates a set of general knowledge sources to the details of the particular game. Among other properties, this analysis determines the relative value of the different pieces in a given game. The values resulting from this analysis are qualitatively similar to values used by experts on known games, and are sufficient to produce competitive performance the first time METAGAMER actually plays each new game. Learning for a new game can then take place at the level of strategic tradeoffs, or by improving on the initially generated game-specific values. Besides being the first metagame-playing program, this was the first program to derive useful piece values directly from analysis of the rules of different games. This talk describes the knowledge implemented in METAGAMER, illustrates the piece values METAGAMER derives for chess and checkers, and discusses experiments with METAGAMER on both existing and newly generated games. I will conclude with some thoughts about the implications for building agents that need to learn and plan in more general domains.
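The abstract does not spell out the analysis, but piece mobility is one plausible general knowledge source for deriving piece values; the sketch below estimates a crude value from a piece's movement rules by counting squares reachable on an empty board, purely as an illustrative assumption rather than METAGAMER's actual method.

```python
# Minimal sketch: derive a crude piece value from its movement rules by
# counting squares reachable from the centre of an empty 8x8 board.
# Mobility is only one of the knowledge sources such an analysis might use.

from typing import List, Tuple

Offset = Tuple[int, int]

def mobility(offsets: List[Offset], sliding: bool,
             origin: Tuple[int, int] = (3, 3), size: int = 8) -> int:
    reachable = set()
    for dr, dc in offsets:
        r, c = origin
        while True:
            r, c = r + dr, c + dc
            if not (0 <= r < size and 0 <= c < size):
                break
            reachable.add((r, c))
            if not sliding:        # non-sliding pieces step only once per offset
                break
    return len(reachable)

if __name__ == "__main__":
    knight = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]
    rook = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    bishop = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    for name, moves, slides in [("knight", knight, False),
                                ("rook", rook, True),
                                ("bishop", bishop, True)]:
        print(name, mobility(moves, slides))
```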
Relevant papers:
Pell, B. (1993). Strategy generation and evaluation for meta-game playing. Doctoral dissertation, University of Cambridge, Cambridge, England.
Pell, B. (1993). Logic programming for general game-playing. Proceedings of the ICML-93 Workshop on Knowledge Compilation and Speedup Learning. Amherst, MA.
Pell, B. (1994). A strategic metagame player for general chess-like games. Proceedings of the Twelfth National Conference on Artificial Intelligence (pp. 1378-1385). Seattle, WA: AAAI Press.
We have developed a process model that learns in different ways while finding faults in a simple control panel device with multiple parts. We made a systematic comparison of the model's learning behavior with data on human learning, with regard to both the time course and the specific sequence of behaviors. This comparison shows that the model accounts very well for measures such as problem-solving strategy, the relative difficulty of faults, and average fault-finding time. Most importantly, because the model learns, it also matches the speedup due to learning when examined across participants, faults, and trials on an individual-participant basis. In this way, it implements a theory of transfer within multi-step diagnostic problem solving. However, participants tended to take longer than predicted to find a fault the second time they completed a task. To examine this effect, we compared the model's sequential predictions with a participant solving five tasks. We found that the poorer predictions for finding a fault the second time may be explained by the time the participant spent reflecting on or checking his work while problem solving. The sequential analysis reminds us that, although aggregate measures can be well matched by a model, the underlying processes that generate these predictions might still differ.
This talk reports work done jointly with Peter Bibby.
Relevant papers:
Ritter, F. E., & Bibby, P. (2001). Modeling how and when learning happens in a simple fault-finding task. Proceedings of the Fourth International Conference on Cognitive Modeling (pp. 187-192). Mahwah, NJ: Lawrence Erlbaum.
Baxter, G. D., & Ritter, F. E. (1995). The Soar FAQ.
Domain elements in real applications are best described as structured elements, consisting of sets of objects with attributes and relations between them. We will describe our research on representation, learning, and inference with structured representations, and exemplify it with some of our work in natural language processing. The focus will be on a knowledge representation framework, feature description logics, that can be used to describe structured domain elements, to extract expressive features from them for the purpose of learning, and to reason with structured elements.
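To make the idea of extracting expressive features from structured elements concrete, the sketch below generates simple relational features (attribute tests and one-step relation chains) from a toy object graph; the feature templates and example are illustrative assumptions, not the feature description logic itself.

```python
# Minimal sketch: generate simple relational features (attribute tests and
# one-step relation chains) from a structured element made of objects,
# attributes, and relations.  Feature templates are illustrative only.

from typing import Dict, Set, Tuple

Attributes = Dict[str, Dict[str, str]]          # object -> {attribute: value}
Relations = Set[Tuple[str, str, str]]           # (relation, source, target)

def relational_features(focus: str,
                        attrs: Attributes,
                        rels: Relations) -> Set[str]:
    features: Set[str] = set()
    for attr, value in attrs.get(focus, {}).items():
        features.add(f"{attr}={value}")                       # attribute test
    for rel, src, dst in rels:
        if src == focus:
            features.add(f"{rel}(x, y)")                      # relation exists
            for attr, value in attrs.get(dst, {}).items():
                features.add(f"{rel}(x, y) & {attr}(y)={value}")  # chained test
    return features

if __name__ == "__main__":
    attrs = {"w1": {"word": "Paris", "pos": "NNP"},
             "w2": {"word": "France", "pos": "NNP"}}
    rels = {("capital_of", "w1", "w2")}
    for f in sorted(relational_features("w1", attrs, rels)):
        print(f)
```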
Human reasoning is almost always constrained by time demands from the external world. In situations like driving a car, air-traffic control, and real-time games, decisions must be made within a certain time frame or the opportunity is lost. To make split-second decisions in time-critical situations, cognitive resources must be used as well as possible by parallelizing cognition, perception, and motor systems. Although many models of expert behavior in these situations already exist, I will address the question of how this optimal behavior can be learned. I will show that two components of fast behavior, parallelizing cognitive resources and learning expert rules, can be achieved by a single learning mechanism in ACT-R, and demonstrate this claim on two tasks. The first involves the game of Set, in which players try to identify, before the other players do, a set of three cards from the twelve cards on the table that satisfy a certain criterion. In this game, rapid reasoning is important to beat the other players. The second domain is CMU-ASP, a complex dynamic task in which participants must find and identify airplanes on a radar screen. Expert behavior is characterized by the ability to decide quickly which plane to attend to next, and to integrate this decision making with perceptual and motor actions.
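The single learning mechanism referred to is production compilation; the sketch below captures its core idea in a stripped-down form, composing two rules that fire in sequence into one specialised rule. The plain condition/action sets here stand in for ACT-R's buffer-based productions and are an illustrative simplification.

```python
# Minimal sketch: compose two rules that fire in sequence into a single
# specialised rule (the core idea behind production compilation).
# Rules are plain condition/action sets here, unlike ACT-R's buffers.

from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class Rule:
    name: str
    conditions: FrozenSet[str]
    actions: FrozenSet[str]

def compile_rules(first: Rule, second: Rule) -> Rule:
    """Conditions of the combined rule: everything the first rule needed,
    plus whatever the second rule needed that the first did not supply."""
    conditions = first.conditions | (second.conditions - first.actions)
    actions = first.actions | second.actions
    return Rule(f"{first.name}+{second.name}", conditions, actions)

if __name__ == "__main__":
    find_card = Rule("find-card",
                     frozenset({"goal:find-set", "card-visible"}),
                     frozenset({"attended:card"}))
    check_card = Rule("check-card",
                      frozenset({"attended:card"}),
                      frozenset({"feature-encoded"}))
    print(compile_rules(find_card, check_card))
```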
Relevant papers:
Taatgen, N.A., van Oploo, M., Braaksma, J. & Niemantsverdriet, J. (2003). How to construct a believable opponent using cognitive modeling in the game of Set. Proceedings of the Fifth International Conference on Cognitive Modeling (pp. 201-206). University of Bamberg.
Taatgen, N.A. & Lee, F.J. (2003). Production compilation: A simple mechanism to model complex skill acquisition. Human Factors, 45, 61-76.
Play Set against ACT-R by downloading the game (for Mac OS X) from http://www.ai.rug.nl/~niels/set-app/.
Despite the long history of machine learning research in relational problem-solving domains, the emphasis in recent years has shifted away from expressive representations of knowledge and powerful problem-solving architectures. In this talk, which is based on joint work with Chandra Reddy, I consider learning in the context of a familiar problem-solving architecture based on hierarchical decomposition of goals into sequences of subgoals. First, I describe a relational learning method that learns hierarchical goal decomposition rules from externally observed sequences of low-level actions. The challenge here is to automatically induce the subgoals from multiple solutions, since the subgoals are not directly apparent in the solutions. Next, I describe a method that learns from "exercises" -- problems of varying difficulty that are solved by the learner itself. Solving the easier problems first would enable learning of rules that help solve the harder problems more easily, an idea referred to as "shaping." The two methods share some features, such as inductive generalization and explanation-based pruning of examples, but also differ in some crucial ways because of their different perspectives on the learning problem. I illustrate the methods with examples in two domains and present empirical results on their effectiveness.
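As a rough picture of how learned goal-decomposition rules are used once acquired, the sketch below represents each rule as a goal paired with an ordered list of subgoals and expands a goal recursively down to primitive actions; the rules, goal names, and the absence of applicability conditions are illustrative simplifications.

```python
# Minimal sketch: goal-decomposition rules map a goal to an ordered list
# of subgoals; goals with no rule are treated as primitive actions.
# The rules and goal names are illustrative only.

from typing import Dict, List

DecompositionRules = Dict[str, List[str]]

def solve(goal: str, rules: DecompositionRules) -> List[str]:
    """Recursively expand a goal into a sequence of primitive actions."""
    if goal not in rules:
        return [goal]                       # primitive action
    plan: List[str] = []
    for subgoal in rules[goal]:
        plan.extend(solve(subgoal, rules))
    return plan

if __name__ == "__main__":
    rules = {"deliver(pkg)": ["fetch(pkg)", "transport(pkg)"],
             "fetch(pkg)": ["goto(depot)", "pickup(pkg)"],
             "transport(pkg)": ["goto(dest)", "drop(pkg)"]}
    print(solve("deliver(pkg)", rules))
```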
Relevant papers:
Reddy, C., & Tadepalli, P. (1999). Learning Horn definitions: Theory and an application to planning. New Generation Computing, 17, 77-98.
Reddy, C., & Tadepalli, P. (1997). Learning goal-decomposition rules using exercises. Proceedings of the Fourteenth International Conference on Machine Learning (pp. 278-286). Nashville, TN.
Reddy, C., Tadepalli, P., & Roncagliolo, S. (1996). Theory-guided empirical speedup learning of goal decomposition rules. Proceedings of the Thirteenth International Conference on Machine Learning (pp. 409-417). Bari, Italy.
Over the years we have developed the Disciple theory, methodology, and family of tools for building knowledge-based agents. The main approach involves developing an agent shell that can be taught directly by a subject matter expert in a way that resembles how the expert would teach a human apprentice when solving problems in cooperation. This talk presents and demonstrates the most recent version of the Disciple framework, which is based on methods for mixed-initiative problem solving (where the expert shows the agent how to solve specific problems and critiques the agent's problem solving attempts), integrated teaching and learning (where the agent helps the expert teach it, by asking relevant questions, and the expert helps the agent to learn, by providing examples, hints, and explanations), and multistrategy learning (where the agent uses multiple strategies, such as learning from examples, from explanations, and by analogy, to acquire general concepts and rules). The Disciple approach has been applied successfully in building agents for several complex, real-world domains, including workaround planning, course of action critiquing, and center of gravity analysis.
Relevant papers:
Tecuci, G., Boicu, M., Marcu, D., Stanescu, B., Boicu, C., & Comello, J. (2002). Training and using Disciple agents: A case study in the military center of gravity analysis domain. AI Magazine.
Tecuci, G., Boicu, M., Bowman, M., & Marcu, D. (2001). An innovative application from the DARPA knowledge bases programs: Rapid development of a high performance knowledge base for course of action critiquing. AI Magazine, 22, 43-61.
Tecuci, G. (1998). Building intelligent agents: An apprenticeship multistrategy learning theory, methodology, tool, and case studies. Academic Press.
The ability to produce high-quality plans, and to do so efficiently, is essential if AI planners are to be widely deployed to solve real-world problems. Previous work has shown that incorporating domain knowledge can improve both planning efficiency and plan quality. Traditionally, this knowledge is encoded as control rules that limit the search used to generate the first viable plan. An alternative approach, called planning by rewriting, instead efficiently generates a low-quality plan and then uses a set of domain-specific "plan rewrite rules" to transform this plan into a higher-quality one. I will review approaches for automatically learning search-control rules and rewrite rules, then present experimental comparisons of the effectiveness of learned rules in improving planning performance across a number of domains.
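To illustrate the rewriting step itself, the sketch below applies rewrite rules (each replacing one contiguous action subsequence) to an initial plan as long as they reduce plan cost; the rule format, cost model, and example are assumptions for illustration, not the learned rules discussed in the talk.

```python
# Minimal sketch: plan rewriting.  A rewrite rule replaces one contiguous
# action subsequence with another; rules are applied while they reduce the
# plan's cost.  The rules and cost model here are illustrative only.

from typing import Dict, List, Tuple

RewriteRule = Tuple[List[str], List[str]]   # (pattern subsequence, replacement)

def apply_rule(plan: List[str], rule: RewriteRule) -> List[str]:
    pattern, replacement = rule
    for i in range(len(plan) - len(pattern) + 1):
        if plan[i:i + len(pattern)] == pattern:
            return plan[:i] + replacement + plan[i + len(pattern):]
    return plan

def rewrite(plan: List[str], rules: List[RewriteRule],
            cost: Dict[str, float]) -> List[str]:
    def plan_cost(p): return sum(cost.get(a, 1.0) for a in p)
    improved = True
    while improved:
        improved = False
        for rule in rules:
            candidate = apply_rule(plan, rule)
            if plan_cost(candidate) < plan_cost(plan):
                plan, improved = candidate, True
    return plan

if __name__ == "__main__":
    # A redundant round trip is removed by a rule whose pattern returns the
    # truck to where it started; the rule is only meaningful for this example.
    plan = ["load(p1)", "drive(a,b)", "drive(b,a)", "drive(a,b)", "unload(p1)"]
    rules = [(["drive(a,b)", "drive(b,a)"], [])]
    print(rewrite(plan, rules, {"drive(a,b)": 5.0, "drive(b,a)": 5.0}))
```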
Relevant papers:
Upal, M. A., & Elio, R. (2000). Learning search control rules versus rewrite rules to improve plan quality. Proceedings of the Thirteenth Canadian Conference on Artificial Intelligence (pp. 240-253). New York: Springer-Verlag.
Upal, M. A. (2001). Learning plan rewrite rules. Proceedings of the Fourteenth FLAIRS Conference (pp. 412-417). Menlo Park, CA: AAAI Press.