Content filtering

Content filtering is the technique of blocking or allowing data based on an analysis of the data itself, rather than on its source or other criteria. It is most widely used on the Internet to filter email and web access.


Content filtering of email

Content filtering is the most commonly used group of methods for filtering spam. Content filters act either on the message content (the information contained in the mail body) or on the mail headers (such as "Subject:") to classify, accept, or reject a message.
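A header-based check of the kind described above can be sketched with Python's standard email package. This is a minimal illustration, not a production rule set: the word list, the "missing Date or Message-ID" rule, and the three result labels are assumptions made for the example.

```python
from email import message_from_string

# Hypothetical subject keywords; a real filter would use a maintained rule set.
SUSPICIOUS_SUBJECT_WORDS = {"lottery", "winner"}


def classify_by_headers(raw_message):
    """Return 'reject', 'suspect', or 'accept' using only the mail headers."""
    msg = message_from_string(raw_message)
    subject = (msg["Subject"] or "").lower()
    # Rule 1: reject when the subject contains a listed keyword.
    if any(word in subject for word in SUSPICIOUS_SUBJECT_WORDS):
        return "reject"
    # Rule 2: flag messages that lack headers normal mail clients add.
    if msg["Message-ID"] is None or msg["Date"] is None:
        return "suspect"
    return "accept"
```

Because headers are easy to forge, a check like this is usually one signal among several rather than a final verdict.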

The most popular type is the Bayesian filter, a statistical filter that estimates the probability that a message is spam from the words it contains.
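The statistical idea behind a Bayesian filter can be sketched as a toy naive Bayes classifier. The class name, the "spam"/"ham" labels, the tokenizer, and the add-one smoothing are assumptions chosen for illustration; real filters use far larger training sets and more careful tokenization.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesFilter:
    """Toy two-class (spam/ham) naive Bayes classifier with add-one smoothing."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Record the words of one message known to be spam or ham."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def _log_score(self, tokens, label):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        # One extra slot stands in for words never seen in training.
        vocab_size = len(vocab) + 1
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        # Smoothed log prior: how common this class is overall.
        score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        # Add the smoothed log likelihood of each token given the class.
        for token in tokens:
            count = self.word_counts[label][token]
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def classify(self, text):
        """Return the label with the higher log score (ties go to 'ham')."""
        tokens = tokenize(text)
        spam_score = self._log_score(tokens, "spam")
        ham_score = self._log_score(tokens, "ham")
        return "spam" if spam_score > ham_score else "ham"
```

After a few calls to train() with labeled messages, classify() returns the label whose word statistics better match the new message. Log probabilities are summed instead of multiplying raw probabilities to avoid numeric underflow on long messages.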

Anti-virus methods can usually be classified as content filters too, since they scan simplified versions of either the binary attachments of mail or the HTML contents. The term can also refer to parental control software that analyzes data and either restricts it or changes it, as with chat filtering. What a content filter looks for depends on the layer of the OSI or Internet model at which content or packets are filtered, and on the application or user requirements. Targets include spam, viruses, computer worms, denial-of-service attacks, trojans, spyware, and human-understandable subject matter such as hate websites, swear words, and the topics of chat conversations.

The Internet does not have a clear, standardized security model designed to limit the extent of security incidents such as worms, which could potentially overload the Internet and cause a global denial of service. Developing intelligent and sophisticated content filtering technology, with standards and cooperation among ISPs, may be a solution.

Content filtering of web content

Content filtering is commonly used by organizations such as offices and schools to prevent computer users from viewing inappropriate websites or content, or as a pre-emptive security measure to prevent access to known malware hosts. Filtering rules are typically set by a central IT department and may be implemented via software on individual computers or at a central point on the network, such as a proxy server or Internet router. Depending on the sophistication of the system used, it may be possible for different computer users to have different levels of Internet access.
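Centrally defined rules with different access levels per user group might look like the following sketch, such as a filtering proxy could apply to each request. The group names, host names, and the POLICIES table are hypothetical and exist only for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical policy table: group -> blocked hosts and blocked (host, path prefix) pairs.
POLICIES = {
    "student": {
        "blocked_hosts": {"games.example", "malware.example"},
        "blocked_prefixes": {("news.example", "/forum/")},
    },
    "staff": {
        "blocked_hosts": {"malware.example"},
        "blocked_prefixes": set(),
    },
}


def host_matches(host, blocked_hosts):
    """True if host equals a blocked name or is a subdomain of one."""
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)


def is_allowed(url, group):
    """Apply the group's policy to a URL; return False if it is blocked."""
    policy = POLICIES[group]
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Whole-site block: the host or any of its subdomains is listed.
    if host_matches(host, policy["blocked_hosts"]):
        return False
    # Section block: only a path prefix on a given host is listed.
    path = parts.path or "/"
    for blocked_host, prefix in policy["blocked_prefixes"]:
        if host_matches(host, {blocked_host}) and path.startswith(prefix):
            return False
    return True
```

Here the "student" group is blocked from an entire games site and from one section of a news site, while "staff" is blocked only from the known malware host.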

Content filtering software is also sometimes used on home computers to restrict children's access to inappropriate websites. Such software is typically described as parental control software.

Filtering methods

Common content filtering methods include:

  • Attachment - Blocking of certain file types (e.g. executable programs).
  • Bayesian - Statistical filtering of the content.
  • DNS-based filtering
  • Char-set
  • Content-encoding
  • Heuristic - Filtering based on heuristic scoring of the content based on multiple criteria.
  • HTML anomalies
  • Language
  • Mail header - Filtering based solely on the analysis of e-mail headers. Made less effective by the ease of message header forgery.
  • Mailing List - Used to detect mailing list messages and file them in appropriate folders.
  • Phrases - Filtering based on detecting phrases in the content text.
  • Proximity - Filtering based on detecting words or phrases when used in proximity.
  • Regular Expression - Filtering based on rules written as regular expressions.
  • URL - Filtering based on the URL. Suitable for blocking websites or sections of websites.

Most content filtering systems use a combination of techniques.
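As a rough sketch of how several of the listed methods might be combined, the following example applies an attachment rule as a hard block and then sums heuristic weights from phrase, regular-expression, and proximity rules. All rule contents, weights, and the threshold are invented for illustration.

```python
import re

# Hypothetical rules and weights; real systems tune these against real traffic.
BLOCKED_EXTENSIONS = (".exe", ".scr", ".bat")
PHRASE_RULES = [("act now", 2.0), ("risk free", 1.5)]
REGEX_RULES = [(re.compile(r"\bv[i1]agra\b", re.IGNORECASE), 3.0)]
PROXIMITY_RULES = [("free", "money", 3, 2.0)]  # (word, word, max gap in words, weight)
THRESHOLD = 4.0


def words_near(words, first, second, max_gap):
    """True if first and second occur within max_gap word positions of each other."""
    pos_first = [i for i, w in enumerate(words) if w == first]
    pos_second = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= max_gap for i in pos_first for j in pos_second)


def score_message(body, attachment_names=()):
    """Return a heuristic score; a blocked attachment type yields infinity."""
    # Attachment rule: a listed file extension blocks the message outright.
    for name in attachment_names:
        if name.lower().endswith(BLOCKED_EXTENSIONS):
            return float("inf")
    text = body.lower()
    words = re.findall(r"[a-z0-9']+", text)
    score = 0.0
    # Phrase rules: exact phrases found anywhere in the lowercased text.
    for phrase, weight in PHRASE_RULES:
        if phrase in text:
            score += weight
    # Regular-expression rules: patterns that catch obfuscated spellings.
    for pattern, weight in REGEX_RULES:
        if pattern.search(body):
            score += weight
    # Proximity rules: two words appearing close together.
    for first, second, max_gap, weight in PROXIMITY_RULES:
        if words_near(words, first, second, max_gap):
            score += weight
    return score


def is_blocked(body, attachment_names=()):
    """Block when the combined score reaches the threshold."""
    return score_message(body, attachment_names) >= THRESHOLD
```

No single rule here is decisive except the attachment check. A message is blocked only when enough independent signals add up, which is the usual motivation for combining techniques.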

See also

  • Application service architecture

Wikimedia Foundation. 2010.
