3D Graphics Animation in Computer Graphics
by
Department of Computer Science
University of Maryland at College Park
I. Introduction
This paper describes several techniques used in 3D graphics implementation.
These techniques are used to implement my 3D animation program, written in
Java. They include Coordinate Frame Transformation, Perspective
Transformation, 3D Volume Clipping, Hidden Surface/Line Removal and Ray Tracing.
II. Points and Vectors
Since we work in 3D space, each point or vector could be represented by an
array of 3 elements, one for each direction of 3D space. However, this is an
ambiguous representation. For example, given an object A = (x, y, z), A could
be either a point at position (x, y, z) or a vector pointing in the direction
(x, y, z). There is no way to distinguish them.
A better way is to append an extra element at the end of the array.
For points, the 4th element is 1; for vectors, the 4th element is 0.
For example, A = (x, y, z, 0) indicates that A is a vector, and B = (x, y, z, 1)
indicates that B is a point.
We know that subtracting a point from a point yields a vector, and that
adding a vector to a point yields a point. These rules still hold under the
above representation. There are also illegal operations, such as adding two
points; these can likewise be detected using the above representation.
This representation therefore has a nice property: every legal operation on
points and vectors can be implemented by performing the corresponding
arithmetic operation on their coordinate lists, and the last element of the
result tells us what was produced:
- if the last element is 1, the result is a point;
- if the last element is 0, the result is a vector;
- otherwise, the operation was illegal.
See the source code for the detailed implementation of vectors.
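
As an illustration, the following is a minimal Java sketch of this
representation. The class name HVector and its methods are illustrative
stand-ins, not the applet's actual API:

    class HVector {
        // (x, y, z, w): w = 1 for a point, w = 0 for a vector.
        double[] element = new double[4];

        HVector(double x, double y, double z, double w) {
            element[0] = x; element[1] = y; element[2] = z; element[3] = w;
        }

        // Component-wise add; the w of the result classifies it:
        // 1 -> point (point + vector), 0 -> vector (vector + vector),
        // anything else (e.g. point + point) -> illegal.
        HVector add(HVector o) {
            HVector r = new HVector(element[0] + o.element[0],
                                    element[1] + o.element[1],
                                    element[2] + o.element[2],
                                    element[3] + o.element[3]);
            r.checkLegal();
            return r;
        }

        // Component-wise subtract; point - point correctly yields w = 0, a vector.
        HVector subtract(HVector o) {
            HVector r = new HVector(element[0] - o.element[0],
                                    element[1] - o.element[1],
                                    element[2] - o.element[2],
                                    element[3] - o.element[3]);
            r.checkLegal();
            return r;
        }

        void checkLegal() {
            if (element[3] != 0 && element[3] != 1)
                throw new IllegalArgumentException("illegal point/vector operation");
        }
    }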
III. Matrices
Matrices are a powerful tool in computer graphics. For the reasons stated
in Section II, a matrix for 3D space is represented as a 4x4 array instead
of a 3x3 array.
Each operation on an object, such as a translation or rotation, can be
represented by one matrix, and multiplying by that matrix transforms a
vector or point accordingly. For example, let P = (x, y, z, 1) be a point's
coordinates before the operation, and let

M = (1 0 0 0)
    (0 1 0 0)
    (0 0 1 0)
    (a b c 1)

be the matrix (written one row per line) representing a translation of the
object by a units right, b units up and c units forward. Note that under the
row-vector convention used here, the translation offsets occupy the bottom
row. Then P' = P*M is the point's coordinates after the operation.
A sequence of operations can also be composed into one operation. For example,
if matrix M1 represents the first operation and matrix M2 represents the
second, then M = M1*M2 represents the sequence of the two operations: if
P = (x, y, z, 1) is the point's coordinates before the sequence, then
P' = P*M is its coordinates after performing operation 1 and then
operation 2, since (P*M1)*M2 = P*(M1*M2).
The following are matrix representations of some operations, again written
one row per line:
- Translate the object by the vector V:
  M = (1    0    0    0)
      (0    1    0    0)
      (0    0    1    0)
      (V[0] V[1] V[2] 1)
- Rotate the object about the X axis by d degrees:
  M = (1 0       0      0)
      (0 cos(d) -sin(d) 0)
      (0 sin(d)  cos(d) 0)
      (0 0       0      1)
- Rotate the object about the Y axis by d degrees:
  M = ( cos(d) 0 sin(d) 0)
      ( 0      1 0      0)
      (-sin(d) 0 cos(d) 0)
      ( 0      0 0      1)
- Rotate the object about the Z axis by d degrees:
  M = (cos(d) -sin(d) 0 0)
      (sin(d)  cos(d) 0 0)
      (0       0      1 0)
      (0       0      0 1)
See the source code for more matrix representations of operations.
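
To make the row-vector convention concrete, here is a minimal Java sketch of
a 4x4 matrix with the operations above; the class Matrix4 and its method
names are illustrative, not the applet's real code:

    class Matrix4 {
        double[][] m = new double[4][4];

        static Matrix4 identity() {
            Matrix4 r = new Matrix4();
            for (int i = 0; i < 4; i++) r.m[i][i] = 1;
            return r;
        }

        // Translation by (a, b, c): offsets in the bottom row, so that
        // (x, y, z, 1)*M = (x+a, y+b, z+c, 1) and vectors (w = 0) are unchanged.
        static Matrix4 translation(double a, double b, double c) {
            Matrix4 r = identity();
            r.m[3][0] = a; r.m[3][1] = b; r.m[3][2] = c;
            return r;
        }

        // Rotation about the Z axis by d degrees (the matrix given above).
        static Matrix4 rotationZ(double d) {
            double t = Math.toRadians(d), c = Math.cos(t), s = Math.sin(t);
            Matrix4 r = identity();
            r.m[0][0] = c; r.m[0][1] = -s;
            r.m[1][0] = s; r.m[1][1] = c;
            return r;
        }

        // M = M1*M2 composes operation 1 followed by operation 2.
        Matrix4 multiply(Matrix4 o) {
            Matrix4 r = new Matrix4();
            for (int i = 0; i < 4; i++)
                for (int j = 0; j < 4; j++)
                    for (int k = 0; k < 4; k++)
                        r.m[i][j] += m[i][k] * o.m[k][j];
            return r;
        }

        // P' = P*M for a homogeneous row vector P of length 4.
        double[] transform(double[] p) {
            double[] r = new double[4];
            for (int j = 0; j < 4; j++)
                for (int k = 0; k < 4; k++)
                    r[j] += p[k] * m[k][j];
            return r;
        }
    }

For example, the single matrix for "translate by (a, b, c), then rotate about
Z by d degrees" would be Matrix4.translation(a, b, c).multiply(Matrix4.rotationZ(d)).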
IV. Coordinate Frames
A Cartesian coordinate frame is represented by an origin and three vectors.
Usually, for a common coordinate frame, the origin of 3D space is (0, 0, 0)
and the three vectors are the following unit vectors:
- X (1, 0, 0): points right
- Y (0, 1, 0): points up
- Z (0, 0, 1): points forward
In general, however, any three linearly independent vectors and an arbitrary
point define a coordinate frame.
We want points and vectors to be represented with respect to some common
coordinate frame, but users would like to define points and vectors relative
to whatever coordinate frame is most convenient. In a complex model there may
be many local coordinate frames used to define the geometric objects. For
example, in my project there are three types of coordinate frames:
- The world coordinate frame, which is the common coordinate frame.
- The viewer frame, which is a coordinate frame relative to the camera's
point of view. This frame is explained in a later section.
- The viewer window (canvas) coordinate frame. This is the standard canvas
coordinate frame: its origin is at the upper left corner, the x direction
points right, and the y direction points downward.
Now consider the general problem: a user has defined points and vectors
relative to his favorite frame G, and wants to know their representation
relative to some standard frame F. How can we perform this transformation?
The answer is matrix multiplication. As explained in Section III, we can
define a matrix M that maps points represented in G's coordinate frame to
their representation in F's coordinate frame, such that for any geometric
object v (point or vector) we have v[F] = v[G]*M.
What is M? Let G.O[F] be the origin of frame G relative to frame F, and let
G.b0[F], G.b1[F], G.b2[F] be the basis vectors of frame G relative to
frame F. Then, writing one row per line,

M = (G.b0[F])
    (G.b1[F])
    (G.b2[F])
    (G.O[F])

Using this matrix multiplication, we can transform coordinates relative to
frame G into coordinates relative to frame F. By the same token,
v[G] = v[F]*M', where M' is the inverse of M, so we can also transform
coordinates relative to frame F back into coordinates relative to frame G.
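
As a sketch, the matrix M could be assembled like this in Java, reusing the
illustrative Matrix4 class from Section III; the class and method names here
are assumptions:

    class Frames {
        // Build M from G's basis vectors and origin, all given as homogeneous
        // rows relative to F (the vectors end in 0, the origin ends in 1).
        static Matrix4 frameChange(double[] b0, double[] b1, double[] b2,
                                   double[] origin) {
            Matrix4 m = new Matrix4();
            m.m[0] = b0.clone();      // G.b0[F]
            m.m[1] = b1.clone();      // G.b1[F]
            m.m[2] = b2.clone();      // G.b2[F]
            m.m[3] = origin.clone();  // G.O[F]
            return m;                 // then v[F] = m.transform(v[G])
        }
    }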
The viewer frame coordinate system is explained in the next section.
V. The Viewer Frame
Usually we treat a photographic device such as a camera as defining a viewer
frame. A viewer frame is defined by the following pieces of information:
- Center of Projection (COP): a point in 3D space, usually where the human
eye is. This point is also the origin of the viewer frame.
- Viewing Window: the window onto which the 3D objects are projected; it
corresponds to the canvas on the applet. It is defined by the following
3 variables:
  - Viewing Distance (VD): the distance from the COP to the viewing window.
  - Viewing Window Width (VW): the width of the viewing window.
  - Viewing Window Height (VH): the height of the viewing window.
- Viewing Direction (Z-basis) Vector (VFZ): a vector to which the projection
plane is orthogonal. Its magnitude is VD/100, so it is initially defined as
(0, 0, VD/100).
- X-basis Vector (VFX): a vector pointing in the right direction. Its
magnitude is (VW/2)/100, so it is initially defined as ((VW/2)/100, 0, 0).
- Y-basis Vector (VFY): a vector pointing in the up direction. Its magnitude
is (VH/2)/100, so it is initially defined as (0, (VH/2)/100, 0).
Among the above information, COP, VFX, VFY and VFZ are variable: they change
when the camera translates or rotates. F and B (the clipping distances
defined in the next section), VD, VW and VH are static: they are fixed
properties of the camera and never change.
From the above information we can calculate two matrices that transform
world coordinates to viewer frame coordinates and vice versa. These two
matrices need to be recalculated every time the camera moves.
- F2W_Matrix: the matrix that transforms viewer frame coordinates to world
coordinates. It equals
  (VFX[0], VFX[1], VFX[2], VFX[3])
  (VFY[0], VFY[1], VFY[2], VFY[3])
  (VFZ[0], VFZ[1], VFZ[2], VFZ[3])
  (COP[0], COP[1], COP[2], COP[3])
- W2F_Matrix: the matrix that transforms world coordinates to viewer frame
coordinates. It equals the inverse of F2W_Matrix.
There are two other static matrices that transform between viewer frame
coordinates and screen coordinates.
- F2S_Matrix: the matrix that transforms viewer frame coordinates to screen
coordinates. It equals:
  ((VW/2)/100, 0, 0, 0)
  (0, -(VH/2)/100, 0, 0)
  (0, 0, 1, 0)
  (VW/2, VH/2, 0, 1)
- S2F_Matrix: the matrix that transforms screen coordinates to viewer frame
coordinates. It is the inverse of F2S_Matrix and equals:
  (200/VW, 0, 0, 0)
  (0, -200/VH, 0, 0)
  (0, 0, 1, 0)
  (-100, 100, 0, 1)
See the source code for the detailed implementation.
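
For illustration, the two static screen matrices could be built as follows,
again reusing the illustrative Matrix4 sketch; the class and method names
are assumptions:

    class ScreenMatrices {
        // Viewer frame -> screen: scale to pixels, flip y (screen y grows
        // downward), then move the origin to the window center.
        static Matrix4 makeF2S(double VW, double VH) {
            Matrix4 m = Matrix4.identity();
            m.m[0][0] = (VW / 2) / 100;
            m.m[1][1] = -(VH / 2) / 100;
            m.m[3][0] = VW / 2;
            m.m[3][1] = VH / 2;
            return m;
        }

        // Screen -> viewer frame: the inverse of the matrix above.
        static Matrix4 makeS2F(double VW, double VH) {
            Matrix4 m = Matrix4.identity();
            m.m[0][0] = 200 / VW;
            m.m[1][1] = -200 / VH;
            m.m[3][0] = -100;
            m.m[3][1] = 100;
            return m;
        }
    }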
VI. 3D Volume Clipping
We defined the viewer frame in the previous section. It is a 3D coordinate
frame whose origin is the eye (COP), whose z-axis points in the view
direction, whose x-axis is oriented to the right of the viewer, and whose
y-axis points up relative to the viewer. In addition to the view frame, it
is convenient to add two additional pieces of information:
- Hither (or near) clipping plane: a plane orthogonal to the viewing
direction and in front of the eye. Objects closer than this plane are
clipped out. The distance from the COP to this plane is denoted F.
- Yon (or far) clipping plane: a plane orthogonal to the viewing direction
and in front of the eye. Objects farther than this plane are clipped out.
The distance from the COP to this plane is denoted B.
The hither and yon clipping planes, together with the extensions of the
4 sides of the window, form a 6-sided truncated pyramid called the viewing
volume. Only objects lying within the viewing volume are displayed. When the
viewing volume is represented with respect to the viewer frame, it is called
the canonical view volume.
The method we use for clipping is a generalization of the Liang-Barsky
algorithm. First, consider how planes are represented. Each plane divides 3D
space into two halfspaces. A halfspace can be defined by two items: a point
that lies on the plane, and a normal vector that is orthogonal to the plane
and points into the halfspace. A point lies within the canonical view volume
only if it belongs to all 6 of the halfspaces below; otherwise it is clipped
out of the volume.
What are those 6 clipping planes (halfspaces)? From the view window and the
viewer frame basis vectors defined in Section V, the 6 clipping halfspaces
are:
- Hither Plane:
reference point (RP[0]): (0, 0, F)
normal vector (NV[0]): (0, 0, 1)
- Yon Plane:
reference point (RP[1]): (0, 0, B)
normal vector (NV[1]): (0, 0, -1)
- Top Plane:
reference point (RP[2]): (0, 0, 0)
normal vector (NV[2]): (0, -1, 1)
- Bottom Plane:
reference point (RP[3]): (0, 0, 0)
normal vector (NV[3]): (0, 1, 1)
- Left Plane:
reference point (RP[4]): (0, 0, 0)
normal vector (NV[4]): (1, 0, 1)
- Right Plane:
reference point (RP[5]): (0, 0, 0)
normal vector (NV[5]): (-1, 0, 1)
Now we have all the halfspaces defined. Consider a segment S = (P0, P1) and
a halfspace H = (RP, NV): how do we determine whether the plane clips the
segment, and if so, where this happens?
Consider the two vectors V0 = P0 - RP and V1 = P1 - RP. S crosses the
clipping plane only if V0 and V1 point to different sides of the plane
relative to NV; in other words, only if the dot products s0 = V0.NV and
s1 = V1.NV have opposite signs. (The negative sign corresponds to the
endpoint that is to be clipped.) In that case, the fraction of the segment
at which it intersects the plane is f = |s0| / (|s0| + |s1|), and the
intersection point P can be computed as P = P0 + f * (P1 - P0).
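
A minimal Java sketch of this test for one segment against one halfspace
follows; the helper names are illustrative, and the full clipper would run
it against all 6 halfspaces in turn:

    class Clipper {
        static double[] sub(double[] a, double[] b) {
            return new double[] { a[0]-b[0], a[1]-b[1], a[2]-b[2], a[3]-b[3] };
        }

        static double dot3(double[] a, double[] b) {
            return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
        }

        // Clip segment (p0, p1) against halfspace (rp, nv); points are
        // homogeneous 4-element rows. Returns the surviving segment,
        // or null if the segment lies entirely outside.
        static double[][] clip(double[] p0, double[] p1, double[] rp, double[] nv) {
            double s0 = dot3(sub(p0, rp), nv);
            double s1 = dot3(sub(p1, rp), nv);
            if (s0 >= 0 && s1 >= 0) return new double[][] { p0, p1 }; // all inside
            if (s0 < 0 && s1 < 0) return null;                        // all outside
            double f = Math.abs(s0) / (Math.abs(s0) + Math.abs(s1));  // fraction
            double[] p = { p0[0] + f*(p1[0]-p0[0]),
                           p0[1] + f*(p1[1]-p0[1]),
                           p0[2] + f*(p1[2]-p0[2]), 1 };
            // The endpoint with the negative dot product is the one clipped away.
            return (s0 < 0) ? new double[][] { p, p1 } : new double[][] { p0, p };
        }
    }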
See the source code for the detailed clipping implementation.
VII. Perspective Transformation
The perspective transformation maps an (n+1)-dimensional object onto an
n-dimensional space; in this case we are mapping 3D objects onto a 2D space.
In the viewer frame's coordinate system, every point of the object is mapped
onto the view window, whose z value is VD. The point (X, Y, Z, 1) in 3D
space is therefore transformed to (X*VD/Z, Y*VD/Z, VD, 1). To preserve the
real depth of the point, we change the transformation to
(X*VD/Z, Y*VD/Z, -1/Z, 1).
Since Z must not be 0, we perform the clipping first and then the
perspective transformation: the clipping procedure removes all points
outside the viewing volume, including the points with Z = 0.
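
A minimal Java sketch of this mapping, assuming the point is already in
viewer frame coordinates and clipping has guaranteed Z > 0; the names are
illustrative:

    class Perspective {
        // (X, Y, Z, 1) -> (X*VD/Z, Y*VD/Z, -1/Z, 1); the -1/Z component
        // grows with Z, so it preserves the depth ordering of points.
        static double[] project(double[] p, double VD) {
            return new double[] { p[0] * VD / p[2], p[1] * VD / p[2], -1 / p[2], 1 };
        }
    }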
VIII. Hidden Surface/Line Removal
For a convex object, one simple way of removing hidden surfaces is the
back-face culling method.
First, we compute each face's outward-pointing normal vector. This can be
done by taking the cross product of two vectors on the face: if V0 and V1
are two vectors on the face, its normal vector is NV = V0 x V1.
After computing the normal vectors, we observe that if a face's normal
vector points away from the view point, the face is hidden and will not be
shown on the view window. The view point is the eye (COP), which sits at the
origin of the viewer frame. Again, the dot product decides this. Consider a
face with normal vector NV and a reference point RP (any arbitrary point on
the face), and define the reference vector RV = RP - COP. If the dot product
NV.RV is negative, NV points back toward the view point and the face is
shown on the view window; otherwise NV points away from the view point, so
the face is hidden and is not shown.
For a wireframe object, hidden-line determination is done in a similar way:
if a line does not belong to any visible face, then the line is hidden.
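
The following is a minimal Java sketch of the back-face test; the helper
names are illustrative, and v0 and v1 are assumed to be ordered so that
their cross product points out of the object:

    class BackFace {
        static double[] sub(double[] a, double[] b) {
            return new double[] { a[0]-b[0], a[1]-b[1], a[2]-b[2], a[3]-b[3] };
        }

        static double dot3(double[] a, double[] b) {
            return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
        }

        // Cross product of two vectors on the face; gives the face normal.
        static double[] cross(double[] a, double[] b) {
            return new double[] { a[1]*b[2] - a[2]*b[1],
                                  a[2]*b[0] - a[0]*b[2],
                                  a[0]*b[1] - a[1]*b[0], 0 };
        }

        // rp: any point on the face; cop: the center of projection.
        static boolean isVisible(double[] v0, double[] v1, double[] rp, double[] cop) {
            double[] nv = cross(v0, v1);    // outward face normal
            double[] rv = sub(rp, cop);     // from the eye to the face
            return dot3(nv, rv) < 0;        // visible iff NV points back at the eye
        }
    }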
See the source code for the detailed implementation.
IX. Lighting
In this project we assume that when light hits a surface, it is scattered
equally in all directions. If the surface is facing the light source, the
energy is spread over the smallest possible area, and that part of the
surface appears brightest. As the angle between the surface normal and the
direction of the light source increases, the same amount of light energy is
spread over a greater area of the surface, so each point of the surface
receives, and hence reflects, a smaller amount of light.
Therefore we define the variable brightness as follows:
brightness = cos(angle), where angle is the angle between the normal vector
of the surface and the light direction. If the color of the face is
(R, G, B), the color shown on the screen would then be
(R*brightness, G*brightness, B*brightness). Because we do not want to lose
the object's true color even when it is very dark, we instead calculate the
color as:
(R/2 + R/2 * brightness, G/2 + G/2 * brightness, B/2 + B/2 * brightness)
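
A minimal Java sketch of this shading rule, assuming nv and lightDir are
unit vectors; the clamp at 0 for faces turned away from the light is an
assumption, and the names are illustrative:

    import java.awt.Color;

    class Shading {
        // brightness = cos(angle) = dot product of the two unit vectors,
        // clamped at 0 so back-lit faces do not produce negative colors.
        static Color shade(double[] nv, double[] lightDir, Color base) {
            double b = Math.max(0, nv[0]*lightDir[0] + nv[1]*lightDir[1]
                                   + nv[2]*lightDir[2]);
            // Half the base color plus half the scaled color keeps the hue
            // visible even when the face is dark.
            return new Color((int) (base.getRed()   / 2 + base.getRed()   / 2 * b),
                             (int) (base.getGreen() / 2 + base.getGreen() / 2 * b),
                             (int) (base.getBlue()  / 2 + base.getBlue()  / 2 * b));
        }
    }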
X. 3D Object Representation
Each 3D object is represented by the following items:
- points: objects of the Vertex class, a subclass of Vector, which contains
the following items:
  - OutOfRange: a flag indicating whether this point is outside of the view
volume.
  - element: a 4-element array holding the vector representation.
  See the source code for the detailed implementation.
- lines: objects of the Line class, which contains the following items:
  - OutOfRange: a flag indicating whether this line is entirely outside of
the view volume.
  - start: the starting point index (starting from 0).
  - end: the ending point index (starting from 0).
  See the source code for the detailed implementation.
- faces: objects of the Face class, which contains the following items:
  - behind: a flag indicating whether this face is visible or invisible.
  - line: an array of all the lines enclosing this face, ordered clockwise.
Each line is identified by its line index (starting from 1); a negative
value means the line's start and end points must be reversed to traverse it
clockwise.
  - numLines: the total number of lines enclosing this face.
  - color: for the surface model with colored faces, this variable holds the
color of the face; for the surface model with a light source, it holds the
color of the face as it will be shown on the screen according to its
brightness.
  See the source code for the detailed implementation.
- color: for the surface model with a light source, this variable holds the
color of the object.
See the source code for the detailed 3D object implementation.
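
The layout above can be pictured with the following Java skeleton; the real
classes in the source code carry more state and methods than shown here:

    class Vertex {                        // one point of the object
        boolean OutOfRange;               // outside the view volume?
        double[] element = new double[4]; // homogeneous coordinates
    }

    class Line {
        boolean OutOfRange;               // entirely outside the view volume?
        int start, end;                   // Vertex indices, starting from 0
    }

    class Face {
        boolean behind;                   // is the face invisible?
        int[] line;                       // enclosing line indices (from 1),
                                          // clockwise; negative = reversed line
        int numLines;                     // number of enclosing lines
        java.awt.Color color;             // face color (scaled by brightness
                                          // in the light-source model)
    }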
XI. Animation Procedure
The procedure of this project is as follows (a sketch of the loop appears
below):
- Transform the object in 3D space (the world coordinate frame) according to
its movement (translation, rotation, spinning).
- Transform the object from the world coordinate frame to the viewer frame.
- Determine the hidden surfaces.
- Clip the object against the canonical view volume.
- Perform the perspective transformation, mapping the 3D object onto the
view window.
- Transform the viewer frame coordinates to the screen coordinate system.
- Paint the object onto the screen (canvas).
This procedure must be performed every time the object or the camera moves.
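
The sketch below shows how these steps could fit together each frame; the
Object3D interface and every method name are assumptions standing in for the
applet's real classes (Matrix4 is the earlier illustrative sketch):

    interface Object3D {
        void animate();                   // step 1: move in world space
        void transform(Matrix4 m);        // multiply every vertex by m
        void determineHiddenSurfaces();   // back-face culling (COP at origin)
        void clipToViewVolume();          // clip against the 6 halfspaces
        void project(double vd);          // perspective transformation
        void paint(java.awt.Graphics g);  // draw visible lines on the canvas
    }

    class Renderer {
        // Runs whenever the object or the camera has moved.
        static void renderFrame(Object3D obj, Matrix4 w2f, Matrix4 f2s,
                                double vd, java.awt.Graphics g) {
            obj.animate();
            obj.transform(w2f);           // world frame -> viewer frame
            obj.determineHiddenSurfaces();
            obj.clipToViewVolume();
            obj.project(vd);              // map onto the view window
            obj.transform(f2s);           // viewer frame -> screen frame
            obj.paint(g);                 // ideally via a double buffer
        }
    }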
XII. Conclusion
This was a very interesting project. The animation looks very nice with the
double buffering technique. However, the project is limited to convex
objects; for a non-convex object, the techniques needed for hidden surface
removal are more complex, and I would like to implement them in the future
if time permits. I would also like to implement an object with curved faces.
In that case each pixel of a face has its own brightness, so the object
would have to be rendered pixel by pixel. This would be a more complicated
project, and the animation might not run smoothly because of the complex
computation involved.
Thanks to Dr. Dave Mount of the Department of Computer Science, University
of Maryland at College Park, for his advice.