Cobweb (clustering)

Cobweb (clustering)

COBWEB is an incremental system for hierarchical conceptual clustering.

COBWEB incrementally organizes observations into a classification tree. Each node in a classification tree represents a class (concept) and is labeled by a probabilistic concept that summarizes the attribute-value distributions of objects classified under the node. This classification tree can be used to predict missing attributes or the class of a new object.

There are four basic operations COBWEB employs in building the classification tree. Which operation is selected depends on the category utility of the classification achieved by applying it. The operations are:

  • Merging Two Nodes
    Merging two nodes means replacing them by a node whose children is the union of the original nodes' sets of children and which summarizes the attribute-value distributions of all objects classified under them.
  • Splitting a node
    A node is split by replacing it with its children.
  • Inserting a new node
    A node is created corresponding to the object being inserted into the tree.
  • Passing an object down the hierarchy
    Effectively calling the COBWEB algorithm on the object and the subtree rooted in the node.

The COBWEB Algorithm

Algorithm COBWEB
  COBWEB(root, record):
  Input: A COBWEB node root, an instance to insert record
  if root has no children then
    children := {copy(root)}
    newcategory(record) \\ adds child with record’s feature values.
    insert(record, root) \\ update root’s statistics
  else
    insert(record, root)
    for child in root’s children do
      calculate Category Utility for insert(record, child),
      set best1, best2 children w. best CU.
    end for
    if newcategory(record) yields best CU then
      newcategory(record)
    else if merge(best1, best2) yields best CU then
      merge(best1, best2)
      COBWEB(root, record)
    else if split(best1) yields best CU then
      split(best1)
      COBWEB(root, record)
    else
      COBWEB(best1, record)
    end if
  end
  • "←" is a loose shorthand for "changes to". For instance, "largestitem" means that the value of largest changes to the value of item.
  • "return" terminates the algorithm and outputs the value that follows.

References

  • Fisher, Douglas H. (July 1987). "Improving inference through conceptual clustering". Proceedings of the 1987 AAAI Conferences. AAAI Conference. Seattle Washington. pp. 461–465. 



Wikimedia Foundation. 2010.

Игры ⚽ Нужна курсовая?

Look at other dictionaries:

  • Cobweb — or cobwebs may refer to: Contents 1 Animals 2 Plants 3 Entertainment …   Wikipedia

  • Conceptual clustering — is a machine learning paradigm for unsupervised classification developed mainly during the 1980s. It is distinguished from ordinary data clustering by generating a concept description for each generated class. Most conceptual clustering methods… …   Wikipedia

  • Data stream clustering — In computer science, data stream clustering is defined as the clustering of data that arrive continuously such as telephone records, multimedia data, financial transactions etc. Data stream clustering is usually studied under the data stream… …   Wikipedia

  • Classification sous contrainte — En intelligence artificielle, la classification sous contrainte désigne une famille d algorithmes d apprentissage semi supervisée. Cette dernière occupe une place intermédiaire entre l apprentissage supervisé (pour laquelle les étiquettes de… …   Wikipédia en Français

  • Fouille de flots de données — Exploration de données Articles principaux Exploration de données Fouille de données spatiales Fouille du web Fouille de flots de données Fouille de textes …   Wikipédia en Français

  • List of mathematics articles (C) — NOTOC C C closed subgroup C minimal theory C normal subgroup C number C semiring C space C symmetry C* algebra C0 semigroup CA group Cabal (set theory) Cabibbo Kobayashi Maskawa matrix Cabinet projection Cable knot Cabri Geometry Cabtaxi number… …   Wikipedia

  • Category utility — is a measure of category goodness defined in Harvtxt|Gluck|Corter|1985 and Harvtxt|Corter|Gluck|1992. It was intended to supersede more limited measures of category goodness such as cue validity (Harvnb|Reed|1972;Harvnb|Rosch|Mervis|1975) and… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”