3D Graphics Animation in Computer Graphics
by
Department of Computer Science
University of Maryland at College Park
I. Introduction
This paper describes several techniques used in 3D graphics implementation.
These techniques are used to implement my 3D animation program, written in
Java. They include Coordinate Frame Transformation, Perspective
Transformation, 3D Volume Clipping, Hidden Surface/Line Removal and Ray Tracing.
II. Points and Vectors
Since we work in 3D space, each point or vector could be represented by an
array of 3 elements, one for each direction of 3D space. However, this is an
ambiguous representation. For example, given an object A = (x, y, z), A could
be either a point at position (x, y, z) or a vector pointing in the direction
(x, y, z). There is no way to distinguish them.
A better way is to append an extra element at the end of the array.
For points, the 4th element is 1; for vectors, the 4th element is 0.
For example, A = (x, y, z, 0) indicates that A is a vector, and B = (x, y, z, 1)
indicates that B is a point.
We know that subtracting a point from a point yields a vector, and that
adding a vector to a point yields a point. These rules still hold under the
above representation. There are also illegal operations, such as adding two
points; these can likewise be detected using the above representation.
This representation therefore has a nice property: every legal operation on
points and vectors can be implemented by performing the corresponding
arithmetic operation on their coordinate lists, and the last element of the
result tells us what was produced:
- if the last element is 1, the result is a point;
- if the last element is 0, the result is a vector;
- otherwise, the operation was illegal.
See the source code for the detailed implementation of vectors.
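
As an illustration, the following is a minimal Java sketch of this
representation. The class name HVector and its methods are illustrative
stand-ins, not the applet's actual API:

    class HVector {
        // (x, y, z, w): w = 1 for a point, w = 0 for a vector.
        double[] element = new double[4];

        HVector(double x, double y, double z, double w) {
            element[0] = x; element[1] = y; element[2] = z; element[3] = w;
        }

        // Component-wise add; the w of the result classifies it:
        // 1 -> point (point + vector), 0 -> vector (vector + vector),
        // anything else (e.g. point + point) -> illegal.
        HVector add(HVector o) {
            HVector r = new HVector(element[0] + o.element[0],
                                    element[1] + o.element[1],
                                    element[2] + o.element[2],
                                    element[3] + o.element[3]);
            r.checkLegal();
            return r;
        }

        // Component-wise subtract; point - point correctly yields w = 0, a vector.
        HVector subtract(HVector o) {
            HVector r = new HVector(element[0] - o.element[0],
                                    element[1] - o.element[1],
                                    element[2] - o.element[2],
                                    element[3] - o.element[3]);
            r.checkLegal();
            return r;
        }

        void checkLegal() {
            if (element[3] != 0 && element[3] != 1)
                throw new IllegalArgumentException("illegal point/vector operation");
        }
    }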
III. Matrices
Matrices are a powerful tool in computer graphics. For the reasons stated
in Section II, a matrix for 3D space is represented as a 4x4 array instead
of a 3x3 array.
Each operation on an object, such as a translation or rotation, can be
represented by one matrix, and multiplying by that matrix transforms a
vector or point accordingly. For example, let P = (x, y, z, 1) be a point's
coordinates before the operation, and let

M = (1 0 0 0)
    (0 1 0 0)
    (0 0 1 0)
    (a b c 1)

be the matrix (written one row per line) representing a translation of the
object by a units right, b units up and c units forward. Note that under the
row-vector convention used here, the translation offsets occupy the bottom
row. Then P' = P*M is the point's coordinates after the operation.
A sequence of operations can also be composed into one operation. For example,
if matrix M1 represents the first operation and matrix M2 represents the
second, then M = M1*M2 represents the sequence of the two operations: if
P = (x, y, z, 1) is the point's coordinates before the sequence, then
P' = P*M is its coordinates after performing operation 1 and then
operation 2, since (P*M1)*M2 = P*(M1*M2).
The following are matrix representations of some operations, again written
one row per line:
- Translate the object by the vector V:
  M = (1    0    0    0)
      (0    1    0    0)
      (0    0    1    0)
      (V[0] V[1] V[2] 1)
- Rotate the object about the X axis by d degrees:
  M = (1 0       0      0)
      (0 cos(d) -sin(d) 0)
      (0 sin(d)  cos(d) 0)
      (0 0       0      1)
- Rotate the object about the Y axis by d degrees:
  M = ( cos(d) 0 sin(d) 0)
      ( 0      1 0      0)
      (-sin(d) 0 cos(d) 0)
      ( 0      0 0      1)
- Rotate the object about the Z axis by d degrees:
  M = (cos(d) -sin(d) 0 0)
      (sin(d)  cos(d) 0 0)
      (0       0      1 0)
      (0       0      0 1)
See the source code for more matrix representations of operations.
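
To make the row-vector convention concrete, here is a minimal Java sketch of
a 4x4 matrix with the operations above; the class Matrix4 and its method
names are illustrative, not the applet's real code:

    class Matrix4 {
        double[][] m = new double[4][4];

        static Matrix4 identity() {
            Matrix4 r = new Matrix4();
            for (int i = 0; i < 4; i++) r.m[i][i] = 1;
            return r;
        }

        // Translation by (a, b, c): offsets in the bottom row, so that
        // (x, y, z, 1)*M = (x+a, y+b, z+c, 1) and vectors (w = 0) are unchanged.
        static Matrix4 translation(double a, double b, double c) {
            Matrix4 r = identity();
            r.m[3][0] = a; r.m[3][1] = b; r.m[3][2] = c;
            return r;
        }

        // Rotation about the Z axis by d degrees (the matrix given above).
        static Matrix4 rotationZ(double d) {
            double t = Math.toRadians(d), c = Math.cos(t), s = Math.sin(t);
            Matrix4 r = identity();
            r.m[0][0] = c; r.m[0][1] = -s;
            r.m[1][0] = s; r.m[1][1] = c;
            return r;
        }

        // M = M1*M2 composes operation 1 followed by operation 2.
        Matrix4 multiply(Matrix4 o) {
            Matrix4 r = new Matrix4();
            for (int i = 0; i < 4; i++)
                for (int j = 0; j < 4; j++)
                    for (int k = 0; k < 4; k++)
                        r.m[i][j] += m[i][k] * o.m[k][j];
            return r;
        }

        // P' = P*M for a homogeneous row vector P of length 4.
        double[] transform(double[] p) {
            double[] r = new double[4];
            for (int j = 0; j < 4; j++)
                for (int k = 0; k < 4; k++)
                    r[j] += p[k] * m[k][j];
            return r;
        }
    }

For example, the single matrix for "translate by (a, b, c), then rotate about
Z by d degrees" would be Matrix4.translation(a, b, c).multiply(Matrix4.rotationZ(d)).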
IV. Coordinate Frames
A Cartesian coordinate frame is represented by an origin and three vectors.
Usually, for a common coordinate frame, the origin of 3D space is (0, 0, 0)
and the three vectors are the following unit vectors:
- X (1, 0, 0): points right
- Y (0, 1, 0): points up
- Z (0, 0, 1): points forward
In general, however, any three linearly independent vectors and an arbitrary
point define a coordinate frame.
We want points and vectors to be represented with respect to some common
coordinate frame, but users would like to define points and vectors relative
to whatever coordinate frame is most convenient. In a complex model there may
be many local coordinate frames used to define the geometric objects. For
example, in my project there are three types of coordinate frames:
- The world coordinate frame, which is the common coordinate frame.
- The viewer frame, which is a coordinate frame relative to the camera's
point of view. This frame is explained in a later section.
- The viewer window (canvas) coordinate frame. This is the standard canvas
coordinate frame: its origin is at the upper left corner, the x direction
points right, and the y direction points downward.
Now consider the general problem: a user has defined points and vectors
relative to his favorite frame G, and wants to know their representation
relative to some standard frame F. How can we perform this transformation?
The answer is matrix multiplication. As explained in Section III, we can
define a matrix M that maps points represented in G's coordinate frame to
their representation in F's coordinate frame, such that for any geometric
object v (point or vector) we have v[F] = v[G]*M.
What is M? Let G.O[F] be the origin of frame G relative to frame F, and let
G.b0[F], G.b1[F], G.b2[F] be the basis vectors of frame G relative to
frame F. Then, writing one row per line,

M = (G.b0[F])
    (G.b1[F])
    (G.b2[F])
    (G.O[F])

Using this matrix multiplication, we can transform coordinates relative to
frame G into coordinates relative to frame F. By the same token,
v[G] = v[F]*M', where M' is the inverse of M, so we can also transform
coordinates relative to frame F back into coordinates relative to frame G.
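
As a sketch, the matrix M could be assembled like this in Java, reusing the
illustrative Matrix4 class from Section III; the class and method names here
are assumptions:

    class Frames {
        // Build M from G's basis vectors and origin, all given as homogeneous
        // rows relative to F (the vectors end in 0, the origin ends in 1).
        static Matrix4 frameChange(double[] b0, double[] b1, double[] b2,
                                   double[] origin) {
            Matrix4 m = new Matrix4();
            m.m[0] = b0.clone();      // G.b0[F]
            m.m[1] = b1.clone();      // G.b1[F]
            m.m[2] = b2.clone();      // G.b2[F]
            m.m[3] = origin.clone();  // G.O[F]
            return m;                 // then v[F] = m.transform(v[G])
        }
    }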
The viewer frame coordinate system is explained in the next section.
V. The Viewer Frame
Usually we treat a photographic device such as a camera as defining a viewer
frame. A viewer frame is defined by the following pieces of information:
- Center of Projection (COP): a point in 3D space, usually where the human
eye is. This point is also the origin of the viewer frame.
- Viewing Window: the window onto which the 3D objects are projected; it
corresponds to the canvas on the applet. It is defined by the following
3 variables:
  - Viewing Distance (VD): the distance from the COP to the viewing window.
  - Viewing Window Width (VW): the width of the viewing window.
  - Viewing Window Height (VH): the height of the viewing window.
- Viewing Direction (Z-basis) Vector (VFZ): a vector to which the projection
plane is orthogonal. Its magnitude is VD/100, so it is initially defined as
(0, 0, VD/100).
- X-basis Vector (VFX): a vector pointing in the right direction. Its
magnitude is (VW/2)/100, so it is initially defined as ((VW/2)/100, 0, 0).
- Y-basis Vector (VFY): a vector pointing in the up direction. Its magnitude
is (VH/2)/100, so it is initially defined as (0, (VH/2)/100, 0).
Among the above information, COP, VFX, VFY and VFZ are variable: they change
when the camera translates or rotates. F and B (the clipping distances
defined in the next section), VD, VW and VH are static: they are fixed
properties of the camera and never change.
From the above information we can calculate two matrices that transform
world coordinates to viewer frame coordinates and vice versa. These two
matrices need to be recalculated every time the camera moves.
- F2W_Matrix: the matrix that transforms viewer frame coordinates to world
coordinates. It equals
  (VFX[0], VFX[1], VFX[2], VFX[3])
  (VFY[0], VFY[1], VFY[2], VFY[3])
  (VFZ[0], VFZ[1], VFZ[2], VFZ[3])
  (COP[0], COP[1], COP[2], COP[3])
- W2F_Matrix: the matrix that transforms world coordinates to viewer frame
coordinates. It equals the inverse of F2W_Matrix.
There are two other static matrices that transform between viewer frame
coordinates and screen coordinates.
- F2S_Matrix: the matrix that transforms viewer frame coordinates to screen
coordinates. It equals:
  ((VW/2)/100, 0, 0, 0)
  (0, -(VH/2)/100, 0, 0)
  (0, 0, 1, 0)
  (VW/2, VH/2, 0, 1)
- S2F_Matrix: the matrix that transforms screen coordinates to viewer frame
coordinates. It is the inverse of F2S_Matrix and equals:
  (200/VW, 0, 0, 0)
  (0, -200/VH, 0, 0)
  (0, 0, 1, 0)
  (-100, 100, 0, 1)
See the source code for the detailed implementation.
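
For illustration, the two static screen matrices could be built as follows,
again reusing the illustrative Matrix4 sketch; the class and method names
are assumptions:

    class ScreenMatrices {
        // Viewer frame -> screen: scale to pixels, flip y (screen y grows
        // downward), then move the origin to the window center.
        static Matrix4 makeF2S(double VW, double VH) {
            Matrix4 m = Matrix4.identity();
            m.m[0][0] = (VW / 2) / 100;
            m.m[1][1] = -(VH / 2) / 100;
            m.m[3][0] = VW / 2;
            m.m[3][1] = VH / 2;
            return m;
        }

        // Screen -> viewer frame: the inverse of the matrix above.
        static Matrix4 makeS2F(double VW, double VH) {
            Matrix4 m = Matrix4.identity();
            m.m[0][0] = 200 / VW;
            m.m[1][1] = -200 / VH;
            m.m[3][0] = -100;
            m.m[3][1] = 100;
            return m;
        }
    }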
VI. 3D Volume Clipping
We defined the viewer frame in the previous section. It is a 3D coordinate
frame whose origin is the eye (COP), whose z-axis points in the view
direction, whose x-axis is oriented to the right of the viewer, and whose
y-axis points up relative to the viewer. In addition to the view frame, it
is convenient to add two additional pieces of information:
- Hither (or near) clipping plane: a plane orthogonal to the viewing
direction and in front of the eye. Objects closer than this plane are
clipped out. The distance from the COP to this plane is denoted F.
- Yon (or far) clipping plane: a plane orthogonal to the viewing direction
and in front of the eye. Objects farther than this plane are clipped out.
The distance from the COP to this plane is denoted B.
The hither and yon clipping planes, together with the extensions of the
4 sides of the window, form a 6-sided truncated pyramid called the viewing
volume. Only objects lying within the viewing volume are displayed. When the
viewing volume is represented with respect to the viewer frame, it is called
the canonical view volume.
The method we use for clipping is a generalization of the Liang-Barsky
algorithm. First, consider how planes are represented. Each plane divides 3D
space into two halfspaces. A halfspace can be defined by two items: a point
that lies on the plane, and a normal vector that is orthogonal to the plane
and points into the halfspace. A point lies within the canonical view volume
only if it belongs to all 6 of the halfspaces below; otherwise it is clipped
out of the volume.
What are those 6 clipping planes (halfspaces)? From the view window and the
viewer frame basis vectors defined in Section V, the 6 clipping halfspaces
are:
- Hither Plane:
reference point (RP[0]): (0, 0, F)
normal vector (NV[0]): (0, 0, 1)
- Yon Plane:
reference point (RP[1]): (0, 0, B)
normal vector (NV[1]): (0, 0, -1)
- Top Plane:
reference point (RP[2]): (0, 0, 0)
normal vector (NV[2]): (0, -1, 1)
- Bottom Plane:
reference point (RP[3]): (0, 0, 0)
normal vector (NV[3]): (0, 1, 1)
- Left Plane:
reference point (RP[4]): (0, 0, 0)
normal vector (NV[4]): (1, 0, 1)
- Right Plane:
reference point (RP[5]): (0, 0, 0)
normal vector (NV[5]): (-1, 0, 1)
Now we have all the halfspaces defined. Consider a segment S = (P0, P1) and
a halfspace H = (RP, NV): how do we determine whether the plane clips the
segment, and if so, where this happens?
Consider the two vectors V0 = P0 - RP and V1 = P1 - RP. S crosses the
clipping plane only if V0 and V1 point to different sides of the plane
relative to NV; in other words, only if the dot products s0 = V0.NV and
s1 = V1.NV have opposite signs. (The negative sign corresponds to the
endpoint that is to be clipped.) In that case, the fraction of the segment
at which it intersects the plane is f = |s0| / (|s0| + |s1|), and the
intersection point P can be computed as P = P0 + f * (P1 - P0).
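
A minimal Java sketch of this test for one segment against one halfspace
follows; the helper names are illustrative, and the full clipper would run
it against all 6 halfspaces in turn:

    class Clipper {
        static double[] sub(double[] a, double[] b) {
            return new double[] { a[0]-b[0], a[1]-b[1], a[2]-b[2], a[3]-b[3] };
        }

        static double dot3(double[] a, double[] b) {
            return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
        }

        // Clip segment (p0, p1) against halfspace (rp, nv); points are
        // homogeneous 4-element rows. Returns the surviving segment,
        // or null if the segment lies entirely outside.
        static double[][] clip(double[] p0, double[] p1, double[] rp, double[] nv) {
            double s0 = dot3(sub(p0, rp), nv);
            double s1 = dot3(sub(p1, rp), nv);
            if (s0 >= 0 && s1 >= 0) return new double[][] { p0, p1 }; // all inside
            if (s0 < 0 && s1 < 0) return null;                        // all outside
            double f = Math.abs(s0) / (Math.abs(s0) + Math.abs(s1));  // fraction
            double[] p = { p0[0] + f*(p1[0]-p0[0]),
                           p0[1] + f*(p1[1]-p0[1]),
                           p0[2] + f*(p1[2]-p0[2]), 1 };
            // The endpoint with the negative dot product is the one clipped away.
            return (s0 < 0) ? new double[][] { p, p1 } : new double[][] { p0, p };
        }
    }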
See the source code for the detailed clipping implementation.
VII. Perspective Transformation
The perspective transformation maps an (n+1)-dimensional object onto an
n-dimensional space; in this case we are mapping 3D objects onto a 2D space.
In the viewer frame's coordinate system, every point of the object is mapped
onto the view window, whose z value is VD. The point (X, Y, Z, 1) in 3D
space is therefore transformed to (X*VD/Z, Y*VD/Z, VD, 1). To preserve the
real depth of the point, we change the transformation to
(X*VD/Z, Y*VD/Z, -1/Z, 1).
Since Z must not be 0, we perform the clipping first and then the
perspective transformation: the clipping procedure removes all points
outside the viewing volume, including the points with Z = 0.
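
A minimal Java sketch of this mapping, assuming the point is already in
viewer frame coordinates and clipping has guaranteed Z > 0; the names are
illustrative:

    class Perspective {
        // (X, Y, Z, 1) -> (X*VD/Z, Y*VD/Z, -1/Z, 1); the -1/Z component
        // grows with Z, so it preserves the depth ordering of points.
        static double[] project(double[] p, double VD) {
            return new double[] { p[0] * VD / p[2], p[1] * VD / p[2], -1 / p[2], 1 };
        }
    }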
VIII. Hidden Surface/Line Removal
For a convex object, one simple way of removing hidden surfaces is the
back-face culling method.
First, we compute each face's outward-pointing normal vector. This can be
done by taking the cross product of two vectors on the face: if V0 and V1
are two vectors on the face, its normal vector is NV = V0 x V1.
After computing the normal vectors, we observe that if a face's normal
vector points away from the view point, the face is hidden and will not be
shown on the view window. The view point is the eye (COP), which sits at the
origin of the viewer frame. Again, the dot product decides this. Consider a
face with normal vector NV and a reference point RP (any arbitrary point on
the face), and define the reference vector RV = RP - COP. If the dot product
NV.RV is negative, NV points back toward the view point and the face is
shown on the view window; otherwise NV points away from the view point, so
the face is hidden and is not shown.
For a wireframe object, hidden-line determination is done in a similar way:
if a line does not belong to any visible face, then the line is hidden.
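
The following is a minimal Java sketch of the back-face test; the helper
names are illustrative, and v0 and v1 are assumed to be ordered so that
their cross product points out of the object:

    class BackFace {
        static double[] sub(double[] a, double[] b) {
            return new double[] { a[0]-b[0], a[1]-b[1], a[2]-b[2], a[3]-b[3] };
        }

        static double dot3(double[] a, double[] b) {
            return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
        }

        // Cross product of two vectors on the face; gives the face normal.
        static double[] cross(double[] a, double[] b) {
            return new double[] { a[1]*b[2] - a[2]*b[1],
                                  a[2]*b[0] - a[0]*b[2],
                                  a[0]*b[1] - a[1]*b[0], 0 };
        }

        // rp: any point on the face; cop: the center of projection.
        static boolean isVisible(double[] v0, double[] v1, double[] rp, double[] cop) {
            double[] nv = cross(v0, v1);    // outward face normal
            double[] rv = sub(rp, cop);     // from the eye to the face
            return dot3(nv, rv) < 0;        // visible iff NV points back at the eye
        }
    }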
See the source code for the detailed implementation.
IX. Lighting
In this project we assume that when light hits a surface, it is scattered
equally in all directions. If the surface is facing the light source, the
energy is spread over the smallest possible area, and that part of the
surface appears brightest. As the angle between the surface normal and the
direction of the light source increases, the same amount of light energy is
spread over a greater area of the surface, so each point of the surface
receives, and hence reflects, a smaller amount of light.
Therefore we define the variable brightness as follows:
brightness = cos(angle), where angle is the angle between the normal vector
of the surface and the light direction. If the color of the face is
(R, G, B), the color shown on the screen would then be
(R*brightness, G*brightness, B*brightness). Because we do not want to lose
the object's true color even when it is very dark, we instead calculate the
color as:
(R/2 + R/2 * brightness, G/2 + G/2 * brightness, B/2 + B/2 * brightness)
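
A minimal Java sketch of this shading rule, assuming nv and lightDir are
unit vectors; the clamp at 0 for faces turned away from the light is an
assumption, and the names are illustrative:

    import java.awt.Color;

    class Shading {
        // brightness = cos(angle) = dot product of the two unit vectors,
        // clamped at 0 so back-lit faces do not produce negative colors.
        static Color shade(double[] nv, double[] lightDir, Color base) {
            double b = Math.max(0, nv[0]*lightDir[0] + nv[1]*lightDir[1]
                                   + nv[2]*lightDir[2]);
            // Half the base color plus half the scaled color keeps the hue
            // visible even when the face is dark.
            return new Color((int) (base.getRed()   / 2 + base.getRed()   / 2 * b),
                             (int) (base.getGreen() / 2 + base.getGreen() / 2 * b),
                             (int) (base.getBlue()  / 2 + base.getBlue()  / 2 * b));
        }
    }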
X. 3D Object Representation
Each 3D object is represented by the following items:
- points: objects of the Vertex class, a subclass of Vector, which contains
the following items:
  - OutOfRange: a flag indicating whether this point is outside of the view
volume.
  - element: a 4-element array holding the vector representation.
  See the source code for the detailed implementation.
- lines: objects of the Line class, which contains the following items:
  - OutOfRange: a flag indicating whether this line is entirely outside of
the view volume.
  - start: the starting point index (starting from 0).
  - end: the ending point index (starting from 0).
  See the source code for the detailed implementation.
- faces: objects of the Face class, which contains the following items:
  - behind: a flag indicating whether this face is visible or invisible.
  - line: an array of all the lines enclosing this face, ordered clockwise.
Each line is identified by its line index (starting from 1); a negative
value means the line's start and end points must be reversed to traverse it
clockwise.
  - numLines: the total number of lines enclosing this face.
  - color: for the surface model with colored faces, this variable holds the
color of the face; for the surface model with a light source, it holds the
color of the face as it will be shown on the screen according to its
brightness.
  See the source code for the detailed implementation.
- color: for the surface model with a light source, this variable holds the
color of the object.
See the source code for the detailed 3D object implementation.
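
The layout above can be pictured with the following Java skeleton; the real
classes in the source code carry more state and methods than shown here:

    class Vertex {                        // one point of the object
        boolean OutOfRange;               // outside the view volume?
        double[] element = new double[4]; // homogeneous coordinates
    }

    class Line {
        boolean OutOfRange;               // entirely outside the view volume?
        int start, end;                   // Vertex indices, starting from 0
    }

    class Face {
        boolean behind;                   // is the face invisible?
        int[] line;                       // enclosing line indices (from 1),
                                          // clockwise; negative = reversed line
        int numLines;                     // number of enclosing lines
        java.awt.Color color;             // face color (scaled by brightness
                                          // in the light-source model)
    }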
XI. Animation Procedure
The procedure of this project is as follows (a sketch of the loop appears
below):
- Transform the object in 3D space (the world coordinate frame) according to
its movement (translation, rotation, spinning).
- Transform the object from the world coordinate frame to the viewer frame.
- Determine the hidden surfaces.
- Clip the object against the canonical view volume.
- Perform the perspective transformation, mapping the 3D object onto the
view window.
- Transform the viewer frame coordinates to the screen coordinate system.
- Paint the object onto the screen (canvas).
This procedure must be performed every time the object or the camera moves.
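
The sketch below shows how these steps could fit together each frame; the
Object3D interface and every method name are assumptions standing in for the
applet's real classes (Matrix4 is the earlier illustrative sketch):

    interface Object3D {
        void animate();                   // step 1: move in world space
        void transform(Matrix4 m);        // multiply every vertex by m
        void determineHiddenSurfaces();   // back-face culling (COP at origin)
        void clipToViewVolume();          // clip against the 6 halfspaces
        void project(double vd);          // perspective transformation
        void paint(java.awt.Graphics g);  // draw visible lines on the canvas
    }

    class Renderer {
        // Runs whenever the object or the camera has moved.
        static void renderFrame(Object3D obj, Matrix4 w2f, Matrix4 f2s,
                                double vd, java.awt.Graphics g) {
            obj.animate();
            obj.transform(w2f);           // world frame -> viewer frame
            obj.determineHiddenSurfaces();
            obj.clipToViewVolume();
            obj.project(vd);              // map onto the view window
            obj.transform(f2s);           // viewer frame -> screen frame
            obj.paint(g);                 // ideally via a double buffer
        }
    }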
XII. Conclusion
This was a very interesting project. The animation looks very nice with the
double buffering technique. However, the project is limited to convex
objects; for a non-convex object, the techniques needed for hidden surface
removal are more complex, and I would like to implement them in the future
if time permits. I would also like to implement an object with curved faces.
In that case each pixel of a face has its own brightness, so the object
would have to be rendered pixel by pixel. This would be a more complicated
project, and the animation might not run smoothly because of the complex
computation involved.
Thanks to Dr. Dave Mount of the Department of Computer Science, University
of Maryland at College Park, for his advice.