| <HTML> |
| <BODY> |
| |
| <H2>Overview</H2> |
| Example that uses parallel_do to do parallel preorder traversal of a sparse graph. |
| <P> |
| Each vertex in the graph is called a "cell". |
| Each cell has a value. |
| The value is a matrix. |
| Some of the cells have operators |
| that compute the cell's value, using other cell's values as input. |
| A cell that uses the value of cell x is called a successor of x. |
| </P><P> |
| The algorithm works as follows. |
| <OL> |
| <LI> Compute the set of cells that have no inputs. This set is called <TT>root_set</TT>. |
| <LI> Each cell has an associated field <TT>ref_count</TT> that is an atomic integer. |
| Initialize <TT>ref_count</TT> to the number of inputs for the Cell. |
| <LI> Update each cell in <TT>root_set</TT>, by applying a <TT>parallel_do</TT> to a <TT>root_set</TT> |
| <LI> After updating a cell, for each of its successors |
| <OL> |
| <LI> Atomically decrement the successor's <TT>ref_count</TT> |
| <LI> If the count became zero, add the cell to the set of cells to be updated, |
| by calling <TT>parallel_do_feeder_impl::add</TT>. |
| </OL> |
| </OL> |
| </P><P> |
| The times printed are for the traversal and update, |
| and do not include time for computing the root_set. |
| </P> |
| <B>NOTE: </B>It is important to understand that this example is unlikely to show speedup |
| if the cell values are changed to type "float". The reason is twofold. |
| <UL> |
| <LI> The smaller value type causes each Cell to be significantly smaller than a cache line, |
| which leads to false sharing conflicts. |
| <LI> The time to update the cells becomes very small, and consequently the overhead of |
| parallel_do swamps the useful work. |
| </UL> |
| |
| <H2>Files</H2> |
| <DL> |
| <DT><A HREF="parallel_preorder.cpp">parallel_preorder.cpp</A> |
| <DD>Source code for example. |
| <DT><A HREF="Graph.cpp">Graph.cpp</A> |
| <DD>Source code for example. |
| <DT><A HREF="Graph.h">Graph.h</A> |
| <DD>Source code for example. |
| <DT><A HREF="Matrix.h">Matrix.h</A> |
| <DD>Source code for example. |
| <DT><A HREF="Makefile">Makefile</A> |
| <DD>Makefile for building example. |
| </DL> |
| |
| <H2>Directories</H2> |
| <DL> |
| <DT><A HREF="msvs">msvs</A> |
| <DD>Contains Microsoft* Visual Studio* 2005 workspace for building and running the example. |
| <DT><A HREF="xcode">xcode</A> |
| <DD>Contains Xcode* IDE workspace for building and running the example. |
| </DL> |
| |
| <H2>To Build</H2> |
| General build directions can be found <A HREF=../../index.html#build>here</A>. |
| |
| <H2>Usage</H2> |
| <DL> |
| <DT><TT>parallel_preorder [<I>M</I>[:<I>N</I>] [<I>Rounds</I> [<I>'pause'</I>]]]</TT> |
| <DD><I>M</I> and <I>N</I> are a range of numbers of threads to be used. |
| <DD><I>Rounds</I> is the number of rounds the example runs internally. Default value |
| is 50; reduce it to shorten example run time. |
| <DD>If 'pause' is specified, the application will wait for a user to hit return before it exits. |
| <DT>To run a short version of this example, e.g., for use with Intel® Threading Tools: |
| <DD>Build a <I>debug</I> version of the example |
| (see the <A HREF=../../index.html#build>build directions</A>). |
| <BR>Run it with the desired number of threads and smaller number of rounds, e.g., <TT>parallel_preorder 4 5</TT>. |
| </DL> |
| |
| <HR> |
| <A HREF="../index.html">Up to parent directory</A> |
| <p></p> |
| Copyright © 2005-2010 Intel Corporation. All Rights Reserved. |
| <p></p> |
| Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are |
| registered trademarks or trademarks of Intel Corporation or its |
| subsidiaries in the United States and other countries. |
| <p></p> |
| * Other names and brands may be claimed as the property of others. |
| </BODY> |
| </HTML> |
| |