Unrolled linked list

In computer programming, an unrolled linked list is a variation on the linked list which stores multiple elements in each node. It can drastically increase cache performance, while decreasing the memory overhead associated with storing list metadata such as references. It is related to the skip list and the B-tree.

Overview

A typical unrolled linked list node looks like this:

record node {
    node next        // reference to next node in list
    int numElements  // number of elements in this node, up to maxElements
    array elements   // an array of numElements elements, with space allocated for maxElements elements
}

Each node holds up to a certain maximum number of elements, typically just large enough so that the node fills a single cache line or a small multiple thereof. A position in the list is indicated by both a reference to the node and a position in the elements array. It is also possible to include a "previous" pointer for an unrolled doubly linked list.
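
As a concrete illustration, one possible C rendering of the record above is sketched below; the 64-byte cache-line target, the int payload type and the names are assumptions of this sketch, not part of the original description.

#include <stdlib.h>   /* malloc, free -- used by the later sketches */
#include <string.h>   /* memcpy, memmove -- used by the later sketches */

/* 13 ints plus the header keep the node at 64 bytes on a typical
   64-bit platform -- an assumption of this sketch, not a requirement. */
#define MAX_ELEMENTS 13

struct unode {
    struct unode *next;             /* reference to the next node in the list */
    int numElements;                /* number of elements in use, up to MAX_ELEMENTS */
    int elements[MAX_ELEMENTS];     /* storage; only the first numElements entries are valid */
};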

To insert a new element, we simply find the node the element should be in and insert the element into the elements array, incrementing numElements. If the array is already full, we first insert a new node either preceding or following the current one and move half of the elements in the current node into it.
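
A sketch of that insertion step, continuing the hypothetical unode struct above (allocation-failure handling omitted for brevity):

/* Insert value at index pos (0 <= pos <= n->numElements) inside node n,
   splitting the node first if its array is already full. */
static void unode_insert(struct unode *n, int pos, int value) {
    if (n->numElements == MAX_ELEMENTS) {
        /* Split: move the upper half of the elements into a new node
           inserted directly after n. */
        struct unode *half = malloc(sizeof *half);
        int keep = n->numElements / 2;

        half->numElements = n->numElements - keep;
        memcpy(half->elements, n->elements + keep,
               (size_t)half->numElements * sizeof n->elements[0]);
        half->next = n->next;
        n->next = half;
        n->numElements = keep;

        /* Decide which of the two nodes the new element belongs to. */
        if (pos > keep) {
            n = half;
            pos -= keep;
        }
    }
    /* Shift the tail of the array right and drop the value in. */
    memmove(n->elements + pos + 1, n->elements + pos,
            (size_t)(n->numElements - pos) * sizeof n->elements[0]);
    n->elements[pos] = value;
    n->numElements++;
}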

To remove an element, similarly, we simply find the node it is in and delete it from the elements array, decrementing numElements. If numElements falls below maxElements ÷ 2 then we pull elements from adjacent nodes to fill it back up to this level. If both adjacent nodes are too low, we combine it with one adjacent node and then move some values into the other. This is necessary to avoid wasting space.
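
A matching removal sketch, again continuing the same hypothetical struct; for simplicity it rebalances only with the following node, whereas a fuller version would also consider the preceding one as the text describes:

/* Remove the element at index pos from node n, then rebalance with the
   node that follows so that n does not stay below half-full. */
static void unode_remove(struct unode *n, int pos) {
    memmove(n->elements + pos, n->elements + pos + 1,
            (size_t)(n->numElements - pos - 1) * sizeof n->elements[0]);
    n->numElements--;

    struct unode *next = n->next;
    int half = MAX_ELEMENTS / 2;
    if (n->numElements >= half || next == NULL)
        return;                                  /* still at least half-full */

    if (next->numElements - (half - n->numElements) >= half) {
        /* The neighbour can spare elements and stay at least half-full:
           pull just enough across to bring n back up to half. */
        int need = half - n->numElements;
        memcpy(n->elements + n->numElements, next->elements,
               (size_t)need * sizeof n->elements[0]);
        memmove(next->elements, next->elements + need,
                (size_t)(next->numElements - need) * sizeof n->elements[0]);
        n->numElements += need;
        next->numElements -= need;
    } else {
        /* Neighbour is also low: merge its elements into n and unlink it. */
        memcpy(n->elements + n->numElements, next->elements,
               (size_t)next->numElements * sizeof n->elements[0]);
        n->numElements += next->numElements;
        n->next = next->next;
        free(next);
    }
}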

Performance

One of the primary benefits of unrolled linked lists is decreased storage requirements. All nodes (except at most one) are at least half-full. If many random inserts and deletes are done, the average node will be about three-quarters full, and if inserts and deletes are only done at the beginning and end, almost all nodes will be full. Assume that:
* "m" = maxElements, the size of each elements array;
* "v" = the overhead per node for references and element counts;
* "s" = the size of a single element.Then, the space used for "n" elements varies between (v + sm)lceil n/m ceil, roughly (v/m + s)n, and twice this value. For comparison, ordinary linked lists require (v + s)n space, although "v" may be smaller, and arrays, one of the most compact data structures, require sn space. Unrolled linked lists effectively spread the overhead "v" over a number of elements of the list. Thus, we see the most significant space gain when overhead is large, maxElements is large, or elements are small.

If the elements are particularly small, such as bits, the overhead can be as much as 64 times larger than the data on many machines. Moreover, many popular memory allocators will keep a small amount of metadata for each node allocated, increasing the effective overhead "v". Both these make unrolled linked lists more attractive.

Another advantage of unrolled linked lists is that they perform a number of operations typically associated with arrays much more quickly than ordinary linked lists. For example, when indexing into an unrolled linked list, we can advance a node at a time rather than an element at a time, reducing the indexing time to O(n/m) instead of O(n).
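
For instance, a short indexing sketch based on the same hypothetical unode struct, which skips whole nodes rather than stepping through individual elements:

/* Return a pointer to the i-th element of the list headed by head,
   or NULL if i is out of range.  Visits at most one node per
   MAX_ELEMENTS/2 elements instead of one node per element. */
static int *unrolled_index(struct unode *head, int i) {
    for (struct unode *n = head; n != NULL; n = n->next) {
        if (i < n->numElements)
            return &n->elements[i];
        i -= n->numElements;         /* skip the whole node at once */
    }
    return NULL;
}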

Unrolled linked lists also perform sequential traversal much more rapidly, due to their cache behavior: iterating through an ordinary linked list of "n" elements triggers "n" cache misses in the worst case, and about "2n/m" in the best case, assuming "v" and "s" are about the same size. Iterating through an unrolled linked list of "n" elements triggers only "2n/m" cache misses in the worst case and "n/m" in the best. Even arrays can do little better than this best case, with about "n/(m + v/s)" misses. When "v" is about "s", ordinary linked lists have twice as many cache misses as unrolled linked lists in the best case, and "m"/2 times as many in the worst case, which in practice can speed up list traversal by as much as 10 times or more.
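
Continuing the illustrative figures assumed above (v = s = 8 bytes, m = 64, one node per cache line): a sequential traversal of "n" elements costs about n/64 cache misses when the unrolled list's nodes are full and up to n/32 when they are only half-full, compared with about n/32 for a contiguously allocated ordinary linked list and up to n misses for a fragmented one, while an array manages about n/65.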

See also

* CDR coding, another technique for decreasing overhead and improving cache locality in linked lists, similar in spirit to unrolled linked lists
* the skip list, a similar variation on the linked list that also offers fast traversal, but which compromises the linked list's quick insertion and deletion less than an unrolled linked list does
* the B-tree and T-tree, data structures that are similar to unrolled linked lists in the sense that each of them could be viewed as an "unrolled binary tree"

