500 Lines or Less | a 3d Modeller
At their core, CAD tools are a method of abstracting the 3-dimensional design into something that can be viewed and edited on a 2-dimensional screen.
CAD tools must offer three basic pieces of functionality. Firstly, they must have a data structure to represent the object that’s being designed: this is the computer’s understanding of the 3-dimensional world that the user is building. Secondly, the CAD tool must offer some way to display the design on the user’s screen. The user is designing a physical object with 3 dimensions, but the computer screen has only 2 dimensions. The CAD tool must model how we perceive objects, and draw them to the screen in a way that the user can understand all 3 dimensions of the object. Thirdly, the CAD tool must offer a way to interact with the object being designed.
The driving force behind many of the design decisions in a 3D modeller is the rendering process.
We would rather not communicate directly with graphics drivers, so we use a cross-platform abstraction layer called OpenGL, and a library called GLUT (the OpenGL Utility Toolkit) to manage our window.
OpenGL is a graphical application programming interface for cross-platform development. It’s the standard API for developing graphics applications across platforms. OpenGL has two major variants: Legacy OpenGL and Modern OpenGL.
Rendering in OpenGL is based on polygons defined by vertices and normals. For example, to render one side of a cube, we specify the 4 vertices and the normal of the side.
Modern OpenGL, on the other hand, features a programmable rendering pipeline where the programmer writes small programs called “shaders” that run on dedicated graphics hardware (GPUs).
GLUT, which is bundled with OpenGL, allows us to create operating system windows and to register user interface callbacks.
In computer graphics, it is convenient to use multiple different coordinate spaces for different types of points.
Transformation matrices convert points from one coordinate space to another coordinate space. To convert a vector v from one coordinate space to another, we multiply by a transformation matrix M: v′=Mv. Some common transformation matrices are translations, scaling, and rotations.
There is one Scene object, owned by the viewer. The Scene instance keeps a list of all of the items in the scene, called node_list. It also keeps track of the selected item. The render function on the scene simply calls render on each of the members of node_list.
Each type of Node defines its own behavior for rendering itself and for any other interactions. The Node keeps track of important data about itself: translation matrix, scale matrix, color, etc.
Rendering nodes is based on the transformation matrices that each node stores. The transformation matrix for a node is the combination of its scaling matrix and its translation matrix.
Using only primitives would be quite limiting for modelling applications. 3D models are generally made up of multiple primitives (or triangular meshes, which are outside the scope of this project).
By making the Node class extensible in this way, we can add new types of shapes to the scene without changing any of the other code for scene manipulation and rendering. Using the node concept to abstract away the fact that one Scene object may have many children is known as the Composite design pattern.
We accomplish rotation of the scene by using a trackball algorithm. The trackball is an intuitive interface for manipulating the scene in three dimensions. Conceptually, a trackball interface functions as if the scene was inside a transparent globe. Placing a hand on the surface of the globe and pushing it rotates the globe. Similarly, clicking the right mouse button and moving it on the screen rotates the scene.
Rotations are traditionally represented in one of two ways. The first is a rotation value around each axis; you could store this as a 3-tuple of floating point numbers. The other common representation for rotations is a quaternion, an element composed of a vector with x, y, and z coordinates, and a w rotation. Using quaternions has numerous benefits over per-axis rotation; in particular, they are more numerically stable. Using quaternions avoids problems like gimbal lock. The downside of quaternions is that they are less intuitive to work with and harder to understand.
To select an item, we use the current projection matrix to generate a ray that represents the mouse click, as if the mouse pointer shoots a ray into the scene. The selected node is the closest node to the camera with which the ray intersects. Thus the problem of picking reduced to the problem of finding intersections between a ray and nodes in the scene. So the question is: How do we tell if the ray hits a node?
Calculating exactly whether a ray intersects with a node is a challenging problem in terms of both complexity of code and of performance. We would need to write a ray-object intersection check for each type of primitive. For scene nodes with complex mesh geometries with many faces, calculating exact ray-object intersection would require testing the ray against each face and would be computationally expensive.
For the purposes of keeping the code compact and performance reasonable, we use a simple, fast approximation for the ray-object intersection test. In our implementation, each node stores an axis-aligned bounding box (AABB), which is an approximation of the space it occupies. To test whether a ray intersects with a node, we test whether the ray intersects with the node’s AABB.
This trade-off between complexity, performance, and accuracy is common in computer graphics and in many areas of software engineering.