WPF Drawing Performance

Starting with version 3.0, the .NET Framework provides two incompatible and unrelated graphics APIs, both aimed at general GUI application development:

  • Windows Forms wraps the GDI+ API introduced in Windows XP, which in turn extends the GDI (Graphics Device Interface) API that dates back to the first versions of Windows. The original interface languages for GDI and GDI+ are C and C++, respectively.
  • Windows Presentation Foundation (WPF) is based on DirectX and exposed exclusively through managed code.

WPF was developed for Windows Vista, whose new Desktop Window Manager (DWM) is likewise based on DirectX rather than GDI. The DWM is enabled by switching to the Aero desktop theme (the default on most editions) and disabled by switching to the Basic theme, which emulates Windows XP.

Rumor has it that Vista was originally intended to use WPF for its entire GUI, but the performance of the new API was not up to the task. Certainly, developers outside Microsoft have frequently criticized WPF for its sluggish performance, especially compared to GDI/GDI+ on Windows XP.

On this page I attempt to measure the performance of simple drawing operations in both WPF and Windows Forms (i.e. GDI+) under a variety of conditions. For comparison, I implemented the same operations in Java’s AWT (Abstract Window Toolkit). Hopefully the results will prove useful to other developers. The test application and its source code are available for download, so you can run your own tests and modify the test cases as desired.

  • Measuring WPF Performance, a rather tricky procedure
  • Drawing Test Application with documentation and download
  • Sample Test Results for various configurations on my system
  • Test Conclusions drawn from these results

Before moving on, I’d like to recommend Jeremiah Morrill’s Critical Deep Dive into the WPF Rendering System. This post is largely unrelated to the following discussion, but it’s a fascinating examination of WPF performance at the lowest level.

Measuring WPF Performance

Attempting to measure the time WPF takes to fully render a window is surprisingly difficult. This is because two different threads collaborate in this task. To explain what that means, here’s a quick overview of how WPF shows things on the screen.

  • WPF operates in retained mode. Calling a drawing method stores the indicated content in some internal format but does not alter the screen. At some unspecified later time, a background thread renders the prepared content to the screen. Rendering happens automatically and repeatedly when necessary, e.g. when an obscured window is uncovered.
  • By contrast, GDI/GDI+ and hence Windows Forms operate in immediate mode. Calling a drawing method immediately renders the indicated content to the screen before the method returns. However, this content is never stored, so the application must repeat those same method calls whenever the same content should be re-rendered.

An immediate mode API can simulate a primitive sort of retained mode by drawing to a memory buffer which is later copied to the screen. Such buffering is frequently used for better performance or smoother animations. Our drawing test application provides test windows for both direct and buffered GDI+.
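As a rough illustration of such buffering, here is a minimal sketch in Java's AWT, the toolkit used for comparison on this page. It is not the GDI+ code from the test application; the class and method names are made up for the example, and a second image stands in for the screen so that the code runs without a display.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Hypothetical sketch: draw into a memory buffer, then copy it to the target in one step. */
final class BufferedDrawingSketch {

    /** Renders one filled triangle into a fresh off-screen buffer. */
    static BufferedImage drawToBuffer(int width, int height) {
        BufferedImage buffer = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = buffer.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            g.setColor(Color.BLUE);
            g.fillPolygon(new int[] {50, 350, 200}, new int[] {350, 350, 50}, 3);
        } finally {
            g.dispose();
        }
        return buffer;
    }

    /** Copies the finished buffer to the target with a single call. */
    static void present(BufferedImage buffer, Graphics2D target) {
        target.drawImage(buffer, 0, 0, null);
    }

    public static void main(String[] args) {
        BufferedImage buffer = drawToBuffer(400, 400);
        // Stand-in for the screen; in a real window this would be the paint handler's Graphics.
        BufferedImage screen = new BufferedImage(400, 400, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = screen.createGraphics();
        present(buffer, g);
        g.dispose();
        System.out.println("Pixel inside triangle: #" + Integer.toHexString(screen.getRGB(200, 300)));
    }
}
```

The buffer can be kept and presented again whenever the window needs repainting, which is what makes this a primitive form of retained mode.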

Since WPF uses retained mode, all new display content passes through two stages before appearing on the screen: first internal preparation, then the actual rendering. WPF implements these stages as follows:

1. Preparation — This includes computing the sizes and layout of all WPF objects to be rendered, as well as recording the actual drawing operations. Any WPF methods that you call explicitly, for example within an OnRender override, are part of this stage.

All preparations are handled synchronously by the message loop running on the (usually single) GUI thread, which is accessible through the Dispatcher property. This is the same mechanism that transmits user input to Windows applications, and it’s the reason why WPF won’t update the display until your topmost event handler has returned. The GUI thread cannot process any drawing operations while it’s in your code – it must return to the message loop first.

(There’s a dangerous trick to get around this, known as DoEvents after the eponymous Windows Forms method, which tells the Dispatcher to immediately work through all pending messages. The drawing test application uses this trick to clear the message queue before the test timer starts.)

2. Rendering — When all preparations are complete, a separate background thread eventually renders the prepared content to the screen. Unfortunately, this thread is completely hidden from user code, and WPF offers no (direct) way to tell when an object has finished rendering. This is a rather big problem for responsive GUI design, as several users have discovered…

(This mechanism also explains why WPF, an API based on DirectX, doesn’t expose a DirectX interface for user drawings. Only the background render thread interacts with DirectX, so any user-supplied DirectX code would have to somehow insert itself into this thread. It’s difficult to see how that could work without messing up existing functionality.)

Measuring Windows Forms is easy: we start a timer before showing a test window, and stop it at the end of the window’s OnPaint handler. Since Windows Forms operates in immediate mode, the entire window has been fully rendered to the screen at that point. Java’s AWT likewise operates in immediate mode, and can be measured in the same way.
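For an immediate-mode toolkit, the measurement amounts to little more than the following sketch, written here for Java's AWT. The class name, the method names, and the placeholder drawing loop are assumptions made for illustration; the real test code draws triangles and runs inside a window's paint handler rather than on an off-screen image.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Hypothetical sketch: time everything from "before showing" to "end of paint". */
final class PaintTimingSketch {
    private long startNanos;
    private long elapsedNanos = -1; // negative until paint() has completed once

    /** Call just before the test window is shown. */
    void start() {
        startNanos = System.nanoTime();
    }

    /** Body of the paint handler: draw everything, then stop the clock. */
    void paint(Graphics2D g, int lineCount) {
        g.setColor(Color.BLACK);
        for (int i = 0; i < lineCount; i++) {
            int y = i % 400;
            g.drawLine(0, y, 399, 399 - y); // placeholder content
        }
        // In immediate mode the content has been drawn once we reach this line.
        elapsedNanos = System.nanoTime() - startNanos;
    }

    double elapsedMillis() {
        return elapsedNanos / 1_000_000.0;
    }

    public static void main(String[] args) {
        BufferedImage target = new BufferedImage(400, 400, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = target.createGraphics();
        PaintTimingSketch timer = new PaintTimingSketch();
        timer.start();
        timer.paint(g, 10_000);
        g.dispose();
        System.out.println("Elapsed ms: " + timer.elapsedMillis());
    }
}
```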

Measuring WPF is more difficult. Once again, we start a timer before showing a test window. But now we need to measure both stages of WPF’s retained mode to find the total time until the window has actually been rendered to screen.

Measuring Preparation

This stage is complete when the UI thread’s message loop has processed all pending messages, which (we assume and hope) all originated from our drawing operations. WPF exposes a Window.ContentRendered event that is perfect for this purpose. Despite its name, this event fires after all window contents have been prepared for rendering for the first time. We react by setting a flag in our test application that activates measurement of the second stage.

Measuring Rendering

We cannot directly access the rendering thread, but WPF does offer one indirect point of access, namely through the CompositionTarget.Rendering event. This event usually fires at the monitor refresh rate (typically 60 times per second), whether there’s any new content to render or not. It is primarily intended for custom animations that need to generate display updates as quickly as the monitor can show them.

However, the Rendering event is tied to the render thread in a way we can exploit: the event is not raised as long as the render thread is busy! It will be raised again at some point during the next refresh interval after the render thread has gone idle. Since we set a flag immediately after the preparations stage was complete, we can now examine that flag in our Rendering handler. If the preparations flag is still set, we know that a test window has just been rendered and we can record the elapsed time.

This trick is not foolproof. Sometimes the Rendering event fires just after a test window has been prepared, but before the render thread has actually started working on it. We circumvent this problem by comparing the event time to the time when the preparations flag was set. If the difference is less than 100 msec (a value tailored to our benchmark), we assume that rendering has not yet happened and wait for the next event to arrive.
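The decision logic of the two handlers can be summarized in a small, toolkit-independent sketch. It is written in Java purely for illustration; the actual test application is a .NET program, and every name here is hypothetical. Timestamps are passed in as plain millisecond values so that the logic can be exercised without WPF, and clearing the flag after a successful measurement is an assumption of this sketch.

```java
import java.util.OptionalLong;

/** Hypothetical sketch of the two-stage timing logic described above. */
final class RenderTimingSketch {
    /** Smallest gap after preparation at which a Rendering event is trusted (benchmark-specific). */
    static final long MIN_RENDER_GAP_MS = 100;

    private long startMs;
    private long preparedMs;
    private boolean prepared;

    /** Call just before the test window is shown. */
    void start(long nowMs) {
        startMs = nowMs;
        prepared = false;
    }

    /** Stands in for the ContentRendered handler: preparation is complete. */
    void onContentPrepared(long nowMs) {
        preparedMs = nowMs;
        prepared = true;
    }

    /** Stands in for the Rendering handler; returns the total time once rendering is assumed done. */
    OptionalLong onRenderingEvent(long nowMs) {
        if (!prepared) {
            return OptionalLong.empty(); // no test window is waiting to be measured
        }
        if (nowMs - preparedMs < MIN_RENDER_GAP_MS) {
            return OptionalLong.empty(); // too soon: assume rendering has not happened yet
        }
        prepared = false; // assumption: ignore later events until the next test
        return OptionalLong.of(nowMs - startMs);
    }

    public static void main(String[] args) {
        RenderTimingSketch timing = new RenderTimingSketch();
        timing.start(0);
        timing.onContentPrepared(40);
        System.out.println(timing.onRenderingEvent(50).isPresent());  // event arrives too early
        System.out.println(timing.onRenderingEvent(900).isPresent()); // render thread idle again
    }
}
```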

Drawing Test Application

All results shown below were obtained with a small test application. The download package DrawingTest.zip (31.8 KB, ZIP archive) comprises the precompiled application for the .NET Framework 4.0 and the complete source code for Visual Studio 2010, as well as a version for Java’s AWT library.

The test application draws 10,000 triangles to a window’s client area, sized 400×400 screen pixels (for GDI+ and AWT) or device-independent units (for WPF). Each triangle is rotated 1° clockwise compared to the previous one. Triangles are drawn either as outlines using pens (“Pens Only”), filled shapes using brushes (“Brushes Only”), or both with different colors (“Pens & Brushes”). All colors are solid, with no patterns, shading, or animation effects of any kind.
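To make the geometry concrete, the following Java sketch computes the corner points of the n-th triangle. The base shape (an isosceles triangle centered in the drawing area) and all names are assumptions for illustration; the description above only fixes the count, the area size, and the 1° clockwise step.

```java
/** Hypothetical sketch: corner points of triangles rotated in 1 degree steps. */
final class RotatedTriangles {

    /** Returns {x0, y0, x1, y1, x2, y2} for the triangle with the given index. */
    static double[] triangle(int index, double size) {
        double cx = size / 2, cy = size / 2;
        // Assumed base shape; chosen so that every rotation stays inside the area.
        double[] base = {
            cx, cy - size * 0.4,
            cx + size * 0.3, cy + size * 0.3,
            cx - size * 0.3, cy + size * 0.3
        };
        double theta = Math.toRadians(index); // 1 degree per triangle
        double cos = Math.cos(theta), sin = Math.sin(theta);
        double[] out = new double[6];
        for (int i = 0; i < 6; i += 2) {
            double dx = base[i] - cx, dy = base[i + 1] - cy;
            // With the y axis pointing down (screen coordinates), this rotation appears clockwise.
            out[i] = cx + dx * cos - dy * sin;
            out[i + 1] = cy + dx * sin + dy * cos;
        }
        return out;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            double[] t = triangle(i, 400);
            if (i % 2500 == 0) {
                System.out.println("Triangle " + i + " apex: " + t[0] + ", " + t[1]);
            }
        }
    }
}
```

Since the step is exactly 1°, every 360th triangle coincides with an earlier one, so 10,000 triangles produce heavy overdraw rather than 10,000 distinct shapes.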

The test application for Java’s AWT library is located in a separate folder and run from the command line – please see the enclosed ReadMe file for instructions. The test application for GDI+ and WPF provides a GUI with five buttons on the left that start the individual test windows, as follows:

  • GDI+ Direct (Alt+D): Shows a Windows Forms window (a.k.a. Form) whose OnPaint handler calls Graphics.DrawPolygon and/or Graphics.FillPolygon to draw the triangles directly to the window. We enable alpha blending (SourceOver) and high-quality compositing to replicate WPF behavior, but testing with SourceCopy and high-speed compositing showed no measurable difference.
  • GDI+ Buffer (Alt+B): Shows a Windows Forms window whose OnPaint handler creates a BufferedGraphics object covering the entire client area, then calls Graphics.DrawPolygon and/or Graphics.FillPolygon to draw the triangles to that buffer, and finally renders the buffer to the window. (This is equivalent to setting the DoubleBuffered or ControlStyles.OptimizedDoubleBuffer flag on the Form.)
  • WPF Line (Alt+L): Shows a WPF window whose OnRender handler calls DrawingContext.DrawLine three times for each triangle. The DrawingContext class does not expose a method to fill arbitrary polygons, so this test supports only the “Pens Only” option.
  • WPF Path (Alt+P): Shows a WPF window whose OnRender handler creates a PathFigure for each triangle, then a PathGeometry containing the figure, and finally calls DrawingContext.DrawGeometry to draw that geometry.
  • WPF Stream (Alt+S): Shows a WPF window whose OnRender handler creates a StreamGeometry for each triangle, which is once again drawn by DrawingContext.DrawGeometry.

To minimize interference with the test timer, I recommend that you move the mouse cursor away from the application and test windows, and start all tests with keyboard shortcuts rather than mouse clicks. If you use high DPI mode, you’ll notice that the Windows Forms and AWT windows appear smaller than the WPF windows. This is correct and due to the fact that WPF automatically scales all coordinates by the current DPI setting, whereas Windows Forms and AWT do not.

Anti-Aliasing

Anti-aliasing, i.e. smoothing the edges of diagonal lines, turns out to have a huge performance impact on all measured drawing APIs. Anti-aliasing is disabled by default for GDI+ and AWT, and enabled by default for WPF. Use “Anti-Aliasing On/Off” to change this setting, which is implemented as follows:

  • GDI+: Set SmoothingMode on the current Graphics object.
  • WPF: Set RenderOptions.EdgeMode for the current Window.
  • AWT: Set RenderingHints.KEY_ANTIALIASING on the current Graphics2D object.
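For the AWT case, the switch can be sketched as follows; the GDI+ and WPF settings are not shown here. The class and helper names are hypothetical, and the triangle is drawn to an off-screen image so that the effect can be checked by counting colors: without anti-aliasing only the fill and background colors appear, with anti-aliasing the edges pick up intermediate shades.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: toggle anti-aliasing on a Graphics2D object. */
final class AntiAliasingSketch {

    /** Draws a black triangle on white, with or without anti-aliasing. */
    static BufferedImage drawTriangle(boolean antiAlias) {
        BufferedImage image = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                    antiAlias ? RenderingHints.VALUE_ANTIALIAS_ON
                              : RenderingHints.VALUE_ANTIALIAS_OFF);
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, 100, 100);
            g.setColor(Color.BLACK);
            g.fillPolygon(new int[] {10, 90, 30}, new int[] {10, 30, 90}, 3);
        } finally {
            g.dispose();
        }
        return image;
    }

    /** Counts the distinct pixel colors in an image. */
    static int distinctColors(BufferedImage image) {
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                colors.add(image.getRGB(x, y));
            }
        }
        return colors.size();
    }

    public static void main(String[] args) {
        System.out.println("Colors without AA: " + distinctColors(drawTriangle(false)));
        System.out.println("Colors with AA:    " + distinctColors(drawTriangle(true)));
    }
}
```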

Note that direct (unbuffered) GDI+ on Windows XP does not support anti-aliasing at all; the corresponding SmoothingMode flag is simply ignored. This is likely a limitation of GDI hardware acceleration on that platform.

WPF Freezing

The figures and geometries created by the WPF Path & Stream tests are always frozen. Testing showed that leaving them unfrozen makes no discernible difference. However, freezing the pens and brushes used by the three WPF windows makes a very big difference, so this feature is controlled by one last option. All WPF pens and brushes are initially unfrozen until you click the “Freeze WPF” button, at which point they remain frozen until the application is closed.

Limitations

The application tests exactly one thing: drawing the outlines and/or interiors of many triangles in solid colors. It does not test anything else, including the following:

  • Standard GUI elements, although those ultimately rely on primitive drawing operations such as the ones that are being tested.
  • Advanced features such as patterns, shading, or animation. WPF is much more powerful in this regard than GDI/GDI+.

If you are interested in the performance of some specific drawing operation that is not covered by the application, you should modify its source code to run your own customized tests on your target system. This is ultimately the only way to find reliable answers.

Sample Test Results

The following tables show sample test results on my system, comprising an Intel DX58SO motherboard with an Intel Core i7 920 CPU (2.67 GHz), 6 GB RAM (DDR3-1333), and an AMD Radeon HD 6970 (2 GB) graphics card, with current AMD and DirectX drivers. The tests were not conducted with any kind of scientific rigor; I simply ran each test several times and picked a nice round median value. In each table, the first three rows were measured with anti-aliasing disabled and the last three rows (“AA +”) with anti-aliasing enabled.

The first table shows the test results, in milliseconds, for Windows XP SP3 (32 bit, 96 dpi, DirectX 9.0c) running in Virtual PC on Windows 7 SP1 (64 bit).

Windows XP            GDI+              WPF Unfrozen               WPF Frozen
                      Direct   Buffer   Line     Path     Stream   Line     Path     Stream
Pens Only             160      390      10,800   2,050    1,850    800      950      750
Brushes Only          300      1,120    n/a      1,600    1,380    n/a      1,300    1,100
Pens & Brushes        460      1,470    n/a      3,350    3,150    n/a      1,950    1,720
AA + Pens Only        n/a      4,760    13,100   4,800    4,500    3,150    3,600    3,400
AA + Brushes Only     n/a      3,930    n/a      3,000    2,750    n/a      2,650    2,450
AA + Pens & Brushes   n/a      8,750    n/a      7,400    7,250    n/a      6,000    5,800

The second table shows the test results, in milliseconds, for Windows 7 SP1 (64 bit, 120 dpi) with the Desktop Window Manager disabled (Windows 7 Basic scheme).

Windows 7 Basic       GDI+              WPF Unfrozen               WPF Frozen
                      Direct   Buffer   Line     Path     Stream   Line     Path     Stream
Pens Only             3,000    360      10,300   1,850    1,650    500      780      570
Brushes Only          3,950    580      n/a      680      480      n/a      400      200
Pens & Brushes        6,900    910      n/a      2,250    2,050    n/a      880      680
AA + Pens Only        7,400    4,550    14,000   6,300    6,100    4,050    5,200    5,000
AA + Brushes Only     7,400    3,850    n/a      680      480      n/a      400      200
AA + Pens & Brushes   14,800   8,450    n/a      6,700    6,480    n/a      5,300    5,100

The third table shows the test results, in milliseconds, for Windows 7 SP1 (64 bit, 120 dpi) with the Desktop Window Manager enabled (Windows 7 Aero scheme).

Windows 7 Aero        GDI+              WPF Unfrozen               WPF Frozen                 Java
                      Direct   Buffer   Line     Path     Stream   Line     Path     Stream   AWT
Pens Only             20,000   350      10,400   1,890    1,680    500      770      560      180
Brushes Only          17,300   580      n/a      700      480      n/a      400      180      520
Pens & Brushes        36,000   920      n/a      2,300    2,080    n/a      880      670      590
AA + Pens Only        25,600   4,500    13,800   6,200    6,000    4,050    5,150    4,950    5,500
AA + Brushes Only     27,800   3,800    n/a      680      480      n/a      400      190      4,800
AA + Pens & Brushes   55,400   8,400    n/a      6,700    6,450    n/a      5,300    5,070    10,200

This table also shows test results for Java, using the standard AWT library from the Sun/Oracle JDK 1.6u26. As you can see, AWT’s performance is roughly comparable to buffered GDI+.

Test Conclusions

I make two assumptions in my following attempt to interpret these results:

  1. My Windows XP results, measured within Virtual PC, are representative of a native Windows XP installation, at least insofar as the relationship between GDI+ and WPF is concerned. I make this assumption because I would expect an additional slowdown from the virtual environment, if anything; but instead many tests run faster than on my native Windows 7 installation. (See below for a note on fill rates, however.)
  2. My Windows 7 results are also representative of Windows Vista. I make this assumption because both operating systems use the new driver and display architecture that was introduced in Vista. In any case, the adoption rate of Vista has been so slow that this probably doesn’t matter.

Once again, I encourage you to download the test application and try it on your own system(s), modifying the test code to your own requirements if necessary. Still, based on my test results as they stand, I’m inclined to draw the following conclusions:

Direct GDI+ is extremely system-dependent — The architectural changes between XP and Vista slowed direct GDI+ operations by two orders of magnitude, whereas WPF shows a significant but smaller variance only in fill rates (see below). On Windows XP, unbuffered GDI+ is 3–67 times faster than WPF; on Windows 7 Basic, between 3 times faster and 37 times slower; and on Windows 7 Aero, never faster and up to 146 times slower!

Conclusion: Unbuffered GDI+ is a great choice for custom drawing on XP (if you don’t need anti-aliasing), but it’s completely useless on newer systems where WPF is usually faster. Buffered GDI+, on the other hand, delivers consistent and competitive performance across all systems – and also supports anti-aliasing on Windows XP.
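The extreme ratios quoted for direct GDI+ can be re-derived from the sample tables. The short Java sketch below does the arithmetic for four table pairs; the pairing of cells is my reading of which values produce the quoted extremes, and the class and method names are hypothetical.

```java
import java.util.Locale;

/** Hypothetical sketch: re-derive the speed ratios from the sample tables (times in ms). */
final class SpeedRatios {

    /** How many times longer the slower operation takes. */
    static double ratio(double slowerMs, double fasterMs) {
        return slowerMs / fasterMs;
    }

    static String describe(String label, double slowerMs, double fasterMs) {
        return String.format(Locale.ROOT, "%s: %.1fx", label, ratio(slowerMs, fasterMs));
    }

    public static void main(String[] args) {
        // XP, Pens Only: WPF unfrozen Line (10,800) vs. GDI+ Direct (160), about 67.5
        System.out.println(describe("XP, GDI+ direct faster", 10_800, 160));
        // Windows 7 Basic, AA + Brushes Only: GDI+ Direct (7,400) vs. WPF frozen Stream (200), 37.0
        System.out.println(describe("Basic, GDI+ direct slower", 7_400, 200));
        // Windows 7 Aero, AA + Brushes Only: GDI+ Direct (27,800) vs. WPF frozen Stream (190), about 146.3
        System.out.println(describe("Aero, GDI+ direct slower", 27_800, 190));
        // GDI+ Direct, Pens Only: Windows 7 Aero (20,000) vs. XP (160), 125.0
        System.out.println(describe("GDI+ direct, Aero vs. XP", 20_000, 160));
    }
}
```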

Anti-aliasing is (almost) always extremely slow — Surprisingly, the fact that WPF enables anti-aliasing by default is the single biggest factor in its apparent slowness compared to other APIs. Turning off AA improves performance in most tests by an order of magnitude, and using identical AA settings dramatically shrinks the performance difference between all three APIs.

Conclusion: Good drawing performance in any of the tested APIs requires disabling anti-aliasing. Once you do that, the choice of API is nearly irrelevant. If you require good performance with AA enabled, however, you’ll need to write raw DirectX or OpenGL code that utilizes your video card’s hardware AA (but see below on WPF brushes).

Freezing WPF pens & brushes is always a good idea: The basic DrawLine method is highly sensitive to this simple optimization and runs 3–20 times faster with frozen pens. One reason for this large speedup is that DrawLine is called three times per triangle, evaluating the current pen each time. The Geometry methods are less sensitive, but freezing pens & brushes still yields a speedup of 10–300%, depending on the system and operation.

Conclusion: Always immediately call Freeze on any freezable WPF object that you don’t want to animate or otherwise change in the future.

More complex WPF APIs are not necessarily faster: The complex “low-level” APIs PathGeometry and StreamGeometry beat equivalent DrawLine calls only when using unfrozen pens, and StreamGeometry significantly outperforms PathGeometry only at hardware-accelerated fill rates. However, note that I only tested very small geometries (i.e. triangles). Larger collections of geometric primitives should improve the relative performance of the Geometry APIs, especially when reused in multiple drawings.

Conclusion: Don’t expect miracles from the complex Geometry APIs. Unless the created geometries are reused, disabling anti-aliasing and freezing all possible WPF objects should yield a much greater speedup.

WPF brushes can be much faster than WPF pens — In most tests, filling triangles with brushes is about as fast as drawing their outlines. This changes dramatically for WPF on Windows 7: using the same drawing technique, brushes are always 2–26 times faster than pens. Even more intriguing, the usual anti-aliasing penalty vanishes completely for WPF brushes – but not for pens! I believe that we observe here the fabled “DirectX acceleration” of WPF, so lamentably unnoticeable in most operations, and that the slower fill rates on XP are due to the graphics card emulation provided by Virtual PC.

Conclusion: When running on modern systems with fast graphics cards, try using WPF brushes instead of WPF pens where possible. This may even allow you to keep anti-aliasing enabled. Unfortunately, this trick is system-dependent and probably won’t work on cheap laptops or old office desktops.

Epilogue

Why does WPF have a reputation for being slow? As far as drawing geometric objects is concerned, the apparent reason is that its designers chose two unusual default values: all objects are drawn with anti-aliasing, and most object data is retrieved from expensive mutable dependency properties.

There are good reasons for both choices. Enabling AA by default is necessary since WPF supports automatic display scaling, but its enormous performance impact is virtually unknown and should have received more publicity. WPF objects must remain mutable until all properties have been initialized, but most objects are never animated or otherwise changed afterward. Perhaps pens & brushes created by parameterized constructors should be frozen by default – or perhaps WPF would have been better off without the elegant but slow dependency property mechanism.

Fortunately, once these two big performance stumbling blocks are known, they are easy to work around. Calling Freeze on all eligible WPF objects is tedious but trivial, and the single line RenderOptions.SetEdgeMode(this, EdgeMode.Aliased); in a control’s constructor disables anti-aliasing for all its contents.
