IRC in Advanced Knowledge Technologies

Advanced Knowledge Technologies: AKT
Abstract Proposal for Interdisciplinary Research Collaborations
© 1999 Advanced Knowledge Technologies Consortium, 24-Sep-1999

Summary

The proposed IRC brings together a strong set of universities and complementary disciplines to tackle fundamental problems associated with the management of knowledge. The IRC will be a multi-million-pound, six-year collaboration to develop and extend a range of technologies to provide integrated methods and services for the capture, modelling, publishing, reuse and management of knowledge. The IRC would undertake fundamental research in particular knowledge technologies but it would also bring together relevant work and produce practical results. The outline proposal has attracted significant and enthusiastic industrial support. We believe it provides an exciting focus for research into knowledge technology.

Introduction

It is a commonly held belief that we live in a world where there has been an explosion of data, information and knowledge. But knowledge is only of value when it can be used effectively and efficiently. The management of knowledge is increasingly being recognised as a key element in extracting its value. We need to understand how best to take knowledge through a series of stages from its creation to its use. It needs to be acquired, modelled and represented, stored and retrieved, used and reused, published and maintained.

The proposed IRC is intended to address all these closely-related issues in an integrated approach. In the next section, we list six challenges which any complete approach to knowledge management must meet. We see these as fundamental bottlenecks that need to be overcome and around which our IRC research agenda is focused.

Six Challenges
Theme 1: The Challenge of Knowledge Acquisition

Although we often suffer from a surfeit of data, we still face situations where the problem is insufficient or poorly-specified knowledge. This defines the area of knowledge acquisition (KA). The KA challenge is to develop and extend our existing techniques and methods. All of the partners in the proposed IRC have a track record of innovative research in KA. For instance, the Aberdeen group have developed a number of cooperative KA and knowledge refinement systems, and Shadbolt's group, formerly based at Nottingham, now at Southampton, has a long history of KA tool design and development.

One feature of this proposal is the integration of recent developments in computational linguistics. This is now possible with the sorts of architectures being developed by the Sheffield group. KA carried out by software agents, and collaborative, group and distributed KA are also promising areas. We also intend to exploit work by the Edinburgh group to embrace the earliest phases of any system’s specification.

Theme 2: The Challenge of Knowledge Modelling

The real requirement of knowledge modelling is to understand the types of template and structure within which knowledge can be most usefully held, and reasoned with. Such structures include problem-solving methods and ontologies.

The Open University has developed a high-level knowledge modelling language, while Shadbolt's team has developed a methodology for constructing models of a domain. Web-based ontology construction is supported by systems developed at the Open University which make it possible to browse, visualise and edit knowledge models over the web. Edinburgh have developed the Enterprise ontology for representing an organisation and its activities, and have considerable experience in applying knowledge modelling to the knowledge engineering process. The Aberdeen group, as a result of empirical studies, identified a number of strategies which scientists used when a taxonomy and a specimen were inconsistent.

We believe that with a strong conception of what constitutes good modelling, based on considerable practical experience, and with the sorts of modelling tools outlined above, we can make real progress on the next challenge – that of reuse.

Theme 3: The Challenge of Knowledge Reuse

One of the most serious impediments to the cost-effective construction of knowledge-intensive systems is that such systems are usually built afresh. There is little reuse of existing domain content or problem-solving experience.

The Open University has been exploring the use of problem-solving methods by developing a set of technologies to support the specification and reuse of problem-solving method components. As well as the modelling tools mentioned above, this research has produced a framework for developing extensive libraries of reusable components. Edinburgh have used ontologies to achieve reuse of knowledge components. Aberdeen have recently developed a semi-formal framework in which to relate problem-solving, KA and knowledge refinement. At Southampton the MEMOIR project is developing a framework for corporate information systems using distributed object-oriented database technology integrated with an open hypermedia system. The Open University is involved in the IBROW³ project, developing an intelligent broker to help configure a knowledge-based system from software components distributed over the WWW.

We look to develop and extend these approaches within the IRC so that knowledge component reuse can become a routine part of knowledge management.

Theme 4: The Challenge of Knowledge Retrieval and Extraction

In any large repository, retrieval of knowledge is an issue. How do we recover a subset of content relevant to a problem or task? In some cases the ideal process is not one of retrieval but of dynamic extraction. Automated methods to support retrieval and extraction are vital, as are architectures to integrate such capabilities.

At Southampton, MAVIS and MAVIS 2 aim to develop Multimedia Architectures for Video, Image and Sound. The Learning Agents group at Aberdeen has developed systems that learn individual user profiles (e.g. of WWW use) and can then suggest further selections.

One problem in knowledge retrieval is that searches often include material of peripheral interest. A related problem is that automated generation methods can produce far more formally expressed knowledge than humans can read. Robertson's group at Edinburgh has been looking at ways of pruning these structures.

The IMPS architecture developed in Shadbolt's group applies knowledge engineering techniques to the search for and organisation of knowledge from the WWW. The University of Sheffield has been developing scalable architectures to exploit natural language processing methods. Other tools are able to provide natural language summaries of text and so can be exploited as knowledge extraction methods in their own right.

Theme 5: The Challenge of Knowledge Publishing

Assuming large repositories of well-structured, well-indexed knowledge can be built, we then face the problem of how best to publish or disseminate this content. Knowledge, as many recognise, is only effective if it is delivered in the right form to the right person at the right time. Different users may want to see knowledge presented and visualised in quite different ways. The dynamic construction of appropriate perspectives is a long-standing challenge.

One way of separating form from content in knowledge publishing is by developing languages appropriate to particular problems and using program generation methods to construct appropriate visualisations. A practical example of this is the website of the Software Systems and Processes group at Edinburgh. In the same spirit, the Open University have the KMi Planet project, an online newspaper managed entirely by intelligent agent software, and D3E, a tool for non-technical users to publish WWW documents for debate. Southampton’s Multimedia Research Group has developed a multimedia authoring and presentation tool, Microcosm.

Each of these approaches promises a great deal for the future of knowledge publishing. We now move on to a final knowledge challenge.

Theme 6: The Challenge of Knowledge Maintenance

Knowledge has a shelf life. Some knowledge has considerable longevity while other content changes very quickly. The challenge is to know when and how to update, and to be able to predict the consequences of any changes. Intriguingly, there is scope to model the equivalent of forgetting in our knowledge systems.

A variety of techniques have been developed by Aberdeen's V&V group to detect inconsistencies, including the use of well-chosen test cases and logical properties; further techniques remove the inconsistency in such a way as to preserve the integrity of the rest of the knowledge. Shadbolt's group has examined collaborative ontology construction tools, highlighting consensus, conflicts and design rationales. John Kingston at Edinburgh is involved in the US High Performance Knowledge Bases project, building and testing tools to cope with the problems of scale.

Research Agenda

The research agenda of this IRC is to address the six challenges laid out above in an integrated approach. Until now, knowledge management has tended to be done in various ad hoc ways, driven either by business imperatives or by disparate academic research agendas. The partners in the IRC, as can be seen from the discussions above, have much expertise and experience in dealing with knowledge-intensive methodologies and technologies. The IRC will link together all of this highly-respected work in a unified context to provide a well-founded knowledge management methodology located centrally within a total systems engineering approach.

The Academic Partners

The consortium is extremely well placed to undertake the ambitious and challenging research agenda outlined above. The partners are all recognised centres of excellence in the key technology areas required to address this agenda. They each have a proven track record of deploying advanced research concepts in applied settings. The consortium has also attracted substantial funding in the past from national, international and commercial agencies.

The University of Aberdeen

Sleeman's group has focused on cooperative KA and knowledge refinement systems. It also has expertise in validation and verification of KBs (Preece), in learning agents (Edwards) and in distributed agent technologies (Norman). The KRAFT project focuses on the fusion of knowledge (constraints) from disparate sources. Digital libraries have become a further focus with the appointment of a sub-group led by Michael Freeston.

The University of Edinburgh

Tate and Kingston of the Artificial Intelligence Applications Institute have both carried out research in knowledge-based planning, high performance knowledge-based systems, international knowledge systems, workflow process standards activities, and the development of commercially applicable and successful knowledge-based systems. Robertson's group specialises in formal modelling and automation in software and knowledge engineering lifecycles. It is known internationally for its work on domain-specific problem description and automated generation of software via formal refinement and parameterisable structures.

The Open University

The Knowledge Media Institute (KMi) at the Open University encompasses a broad programme of research into innovative approaches to sharing, accessing, and understanding knowledge. It has extensive expertise in developing internet-based services and in constructing and using knowledge modelling components. It has developed an extensive infrastructure for knowledge modelling, including languages, libraries, web-based knowledge acquisition interfaces and support for web-based interoperability.

The University of Sheffield

Professor Wilks’ group at the Department of Computer Science is recognised as being one of the major European centres of excellence in the area of computational linguistics. The group has developed tools and techniques that support the automatic analysis and generation of free and structured text. Sheffield's NLP Group has already developed a state-of-the-art Information Extraction system for the recent MUC-6 evaluation of information extraction from financial newswires.

The University of Southampton

The Multimedia Research Group currently focuses on a number of areas, such as open hypermedia systems, distributed information and agent technology, image, video and audio processing, and digital libraries. Shadbolt’s Group at Nottingham worked in knowledge engineering and acquisition, including the evaluation of KA methods, methodologies for construction of knowledge-intensive systems and reuse of knowledge, and the construction of KA software. Shadbolt and members of his group will move to Southampton in January 2000.