The last couple weeks I have been working on an OGR DXF driver. AutoCAD DXF format is a popular interchange format for CAD data, and to a somewhat lesser extent for goespatial map data often originating from engineering departments. It is a rather ugly format. Even though it is ASCII it is less than fun for humans to scan.
The raw machinery of the format is published by Autodesk, and lots of translators have been written for it in the past. However, I find it very frustrating that the format specifications fail to address the semantics of the format to any meaningful degree. It is assumed, I guess, that the person reading them is already deeply familiar with the AutoCAD data model.
So, for instance, it talks about the BLOCKS section, and the INSERT entity, but it never really explains that by defining a bunch of entities as a block, and then putting them into the drawing it makes it possible to treat a bunch of entities as a group. This has to be deduced. I have an old dxf file produced by FME that uses this as a mechanism to group a set of polylines that form a multipolygon, but I'm not sure how widely this approach is used.
In my driver, I have implemented two mechanisms for supporting blocks. In the case of non-text entities I try to accumulate the geometries I derive from the entities in the block into one OGC geometrycollection. But I also found that blocks in some drawings are used to group sets of text elements. In this case if I only use the geometries, I've discarded the important component - the text - so I collect the text entities as full features. And when I encounter the INSERT entity I push the original set of features into the input stream.
I spent most of a day implementing read support for the DIMENSION entity. This is essentially supposed to be a single object that shows a measurement on a drawing. The red part below is a single dimension element as rendered by QCAD.
Of course, there is no clear analog to this in OGR. So I ended up creating a single feature with a MULTILINESTRING geometry for the leaders, and arrows, and then pushed the label as a separate point feature with a LABEL style string.
I have made some effort to capture the drawing style information so that features can, in theory, be drawn similarly to AutoCAD by OGR applications. There is still really only one way to do this in OGR and that is to provide styling information as OGR Feature Style strings. This is a format that Daniel Morissette, myself and I believe Stephane Villeneuve defined many years ago as based closely on the sorts of styling supported by the mapinfo format(s). A few extensions have been made over the years, and a few OGR drivers utilize it to return styling information including DGN, and MapInfo. Andrey Kiselev has also done some stuff with it, so I presume there is a driver of his that uses it. So far the only two application that I know of using the feature style information are OpenEV and MapServer (via OGR autostyling).
I'm quite conflicted about the OGR feature style specification. I just hate to define "yet another" way of describing feature styling that is not closely related to any proper standard. I can't help but think there ought to be something more standards oriented to use as a basis. Perhaps OGC SLD, or SVG or something like that. But SLD didn't exist when we started, and does not express a lot of what I want. I don't really know that much about SVG - perhaps it would be a good choice. So, lacking a clear plan each time I need styling I do a bit more work on the OGR feature style specification while remaining less than fully committed to it as a long term thing.
Another aspect that was challenging was all the curves. I added the OGRGeometryFactory::approximateArcAngles() method as a generic mechanism to approximate arcs on an ellipsoid or circle as linestrings. The code was adapted from the Oracle driver and similar code exists in the DGN, and NTF drivers too I believe. So, finally this moves into the core.
One of the DXF curve types is a spline. Review the QCAD source code I found they implemented the spline rendering using rationale b-spline curves derived from Chapter 4 of An Introduction to NURBS by David Rogers. They release under the GPL so I can't directly use the code from QCAD without also putting GDAL/OGR under GPL restrictions, so I contacted the original author for permission to publish this code under the MIT/X license, like GDAL. He has indicated he has received my email and is considering the request ... so I wait.
In addition to the specification, I have also found the QCAD (and it's underlying dxflib library) to be a useful reference. I considered using dxflib but it really does not seem to solve any of the hard parts for me, and it would have added a dependency on an external library - complicating building of GDAL/OGR. The other very helpful resource was the v.in.dxf code from GRASS. This is a slightly less sophisticated dxf reader than OGR (IMHO) but it provided a very easy to understand implementation of a DXF reader for GIS data that was especially helpful in writing the proposal for the DXF driver. I also used it as a reference when implementing some of the element translations.
Before closing, I would like to thank Andreas Neumann and the City of Uster who have provided funding for this development. It seems that local government in Switzerland punches above it's weight in the free gis software world (the Kanton of Solothurn is also a big supporter of qgis, and related technologies as I understand). If only one in ten cities in the world made serious use of free gis software and provided enough financial support for one core developer it would have a huge boosting effect.
Anyways, preliminary DXF read support is in GDAL SVN now. Consider trying it out and providing feedbac.