| <HTML> |
| <BODY> |
| |
| <H2>Overview</H2> |
| Polygon Overlay example that demonstrates the use of parallel_for. |
| <P> |
| This example is a simple implementation of polygon overlay, as described in |
| <a href="http://citeseer.ist.psu.edu/cache/papers/cs/11981/ftp:zSzzSzftp.cs.vu.nlzSzpubzSzbalzSzcowichanzSzPolygonzSzreport.pdf/langendoen95parallelizing.pdf"> |
| <i>Parallelizing the Polygon Overlay Problem Using Orca</i>, by H.F. Langendoen</a>. |
| </P> |
| The solution was implemented in three forms: |
| <UL> |
| <LI>The naive serial solution. |
| <LI>The naive parallel solution, by splitting list of polygons from one map and intersecting |
| each sub-list against the entire list of polygons from the second map. |
| <LI>A parallel solution where each map is split into submaps, with each resulting submap being |
| intersected against the corresponding submap from the other map. This solution requires some |
| redundancy (some polygons are members of more than one submap). To prevent multiple copies |
| of a polygon from being placed in the solution map, if both polygons are duplicated (that is, |
| if they both appear in more than one map), they are intersected but the result is not placed |
| in the solution map. |
| </UL> |
| The only optimization in each solution is that the area of the generated sub-polygons are subtracted from |
| the original area of one of the source polygons. When the remaining area is zero, the intersection process |
| is halted. |
| <P> |
| <i>A word about the speedup of the submap case.</i> One may get superlinear speedup in this case (for instance a |
| laptop with Intel® Core(TM) Duo processor got a speedup of about 20 percent over serial.) This results from two effects: |
| </P> |
| <UL> |
| <LI>the number of threads used, and |
| <LI>the fact that for each submap, the number of polygons is smaller than that for the other two cases. |
| </UL> |
| If there are, say, 400 polygons in each map, then on average the number of intersections calculated is |
| approximately 80,000 (400 * 200, where 200 is the average number of polygons examined before stopping.) |
| If the maps are split into 2 submaps, the time for each submap is about 200*100, or 20,000. So even |
| comparing the two sets of submaps serially should result in a speedup somewhere around 2. This number |
| is affected by the number of redundant polygons being compared; this effect would eventually swamp the gain |
| from comparing smaller numbers of polygons per submap. And remember the submaps are created by intersecting each |
| map with a rectangular polygon covering the submap being generated, which is additional work taking about N * O(400) |
| in the case above, where N is the number of submaps generated, that can be done in parallel. |
| <P> |
| Running the default release pover while varying the number of submaps from 1 to 1000, the speedup on the submap |
| case for a 2-processor system looks like<br> |
| <img src="speedup.gif" alt="Table of speedup for the algorithm"><br> |
| </P> |
| <P> |
| One further optimization would be to sort one map, say <b>map1</b>, by maxY, and sort the other map (<b>map2</b>) |
| by minY. For <b>p1</b> in <b>map1</b>, start testing for intersection at the first <b>p2</b> in <b>map2</b> |
| that intersected the last polygon tested in <b>map1</b>. This would speed up the intersection process greatly, |
| but the optimization would apply to all the methods, and the sort would have to be accounted for in the timing. |
| </P> |
| <P> |
| The source maps are generated pseudo-randomly in the manner described in the paper above. That is, if |
| we need N polygons, then N "boxes" are chosen at random, then one-at-a-time the areas are expanded in |
| one of fours directions until the area hits an adjacent polygon. When this process is finished, the |
| resulting map is inspected and any remaining unoccupied "boxes" are made into additional polygons, as |
| large as possible in each case. So the actual number of polygons in each map will in general be larger |
| than the number of polygons requested (sometimes by 10% or more.) |
| </P> |
| <P> |
| One limitation of the program is that if the number of polygons in the source map is greater than the number of |
| "boxes" (pixels in the GUI case), the maps cannot be generated. |
| </P> |
| |
| <H2>Files</H2> |
| <DL> |
| <DT><A HREF="polyover.cpp">polyover.cpp</A> |
| <DD>Source code for main program. |
| <DT><A HREF="polyover.h">polyover.h</A> |
| <DD>Global variables, classes and enums. |
| <DT><A HREF="pover_video.cpp">pover_video.cpp</A> |
| <DD>Source code for the GUI interface. |
| <DT><A HREF="pover_video.h">pover_video.h</A> |
| <DD>Defines for the GUI version. |
| <DT><A HREF="Makefile">Makefile</A> |
| <DD>Makefile for building example. |
| </DL> |
| |
| <H2>Directories</H2> |
| <DL> |
| <DT><A HREF="msvs">msvs</A> |
| <DD>Contains Microsoft* Visual Studio* 2005 workspace for building and running the example. |
| <DT><A HREF="xcode">xcode</A> |
| <DD>Contains Mac OS* Xcode* workspace for building and running the example. |
| </DL> |
| |
| <H2>To Build</H2> |
| General build directions can be found <A HREF="../../index.html#build">here</A>. For the various UI options, see the <A HREF="../../common/index.html">common GUI code</A> build instructions. |
| |
| <P> |
| For Windows* systems, Microsoft* Visual Studio* projects are provided for each of the above versions. |
| </P> |
| |
| <H2>Usage</H2> |
| Building via the above make commands, or via Visual Studio projects on Windows* systems, produces executable files |
| named pover.exe. To run these executables directly, use one or more of the following commands. |
| <DL> |
| <DT><TT>pover.exe</TT> |
| <DD>Run this version (release or debug). |
| <DT><TT>pover.exe n:m</TT> |
| <DD>Run this version (release or debug) (m-n+1) times, with n threads to m threads inclusive. |
| <DT>To run a short version of this example, e.g., for use with Intel® Threading Tools: |
| <DD>Build a <I>debug</I> version with the GUI turned off |
| (e.g., <TT>make UI=con debug</TT>; see also the build directions above). |
| <BR>Run it with a small dataset, e.g., <TT>pover.exe --polys 10 --size 5x5</TT>. |
| </DL> |
| |
| <H2>Notes</H2> |
| <UL> |
| <LI>While running with the GUI display should yield reasonable performance in most cases, <I>running with no GUI |
| display is strongly recommended</I> in order to demonstrate the full performance and scalability of the example. |
| <LI>If using the X-windows (X11) GUI on Mac OS* systems, X11 might not be installed on the system by default. |
| To install X11 on Mac OS* systems, use the operating system install disk, choose "Optional installs" and select X11 from |
| the "Applications" list. Alternatively, if X11 is not available, build without the GUI (see build targets above). |
| </UL> |
| |
| <HR> |
| <A HREF="../index.html">Up to parent directory</A> |
| <p></p> |
| Copyright © 2005-2010 Intel Corporation. All Rights Reserved. |
| <p></p> |
| Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are |
| registered trademarks or trademarks of Intel Corporation or its |
| subsidiaries in the United States and other countries. |
| <p></p> |
| * Other names and brands may be claimed as the property of others. |
| </BODY> |
| </HTML> |