HTML Basics

Video Lectures

All of our lectures are on YouTube. This is the best order for viewing them:

Notes

How Files Work Online

When writing HTML, the main page in any directory should be called index.html. This is a pre-configured default name that servers look for when someone tries to access a directory online. That way, you can reach your website with just a directory name and see the file. For example, http://terpconnect.umd.edu/~jgolbeck will show the index.html in my main pub directory. If I make a sub-directory, say INST671, and put a file called index.html in it, http://terpconnect.umd.edu/~jgolbeck/INST671 will show that index.html file. If I want to see another page, say "about.html", I need to put that page's name at the end of the URL, like this: http://terpconnect.umd.edu/~jgolbeck/INST671/about.html.

What is HTML?

HTML stands for HyperText Markup Language. A markup language is basically a set of markers that you insert into a text document in order to format it. In HTML, those markers are called "tags".

Learning HTML is best done through practice and example. If you see a site that you like, you will learn a lot by looking at the source code there. You can do this with the "View Source" or "Page Source" option in your browser, usually found in the View or Developer menu or by right-clicking the page. That will bring up all of the HTML code for the current document. Some pages will be too complicated to understand, especially on big professional sites, but it's worth looking.

Standard tags work in any browser. The W3C (World Wide Web Consortium) sets standards for HTML that every browser is expected to support. Some browsers add their own features. Anything that is browser-specific will not be covered in this class, and you should avoid using it because a big portion of the world won't be able to see your pages correctly.

Tag Structure and Document Skeleton

With that said, let's look at the basic tag structure. Each tag is enclosed in a set of < and > symbols. Every tag also has a name. The name of the tag generally identifies its function. For example, to start an HTML document, you need the html tag that tells the browser to interpret the code that follows:

<html>

Most tags have a corresponding end tag. That means whatever the tag started is then stopped. So, at the end of your document, there should be an end HTML tag to indicate the HTML is finished. Each end tag is also enclosed in < and > symbols. It has a / and then the name of the tag you are ending. Thus, to end your HTML document, you use

</html>

Let's look at some other basic tags. You must always start with the html tag and end with the /html tag. You have the option of inserting a head section into your document. This allows you to do some special functions. The only one we will be concerned with now is the "title" tag. This puts a title in the title bar or tab at the top of your browser window. On this page you're reading, the title is "INST671 - Web Programming". To make a head section and title, the code would look like this:

<html>
<head><title>This is my page title</title></head>
</html>

Notice that the actual title that you want to appear at the top of your document is sandwiched between a title and /title tag. No other tags can go in the title section; it has to be plain text.

The tags we've seen so far do not allow us to actually make a web page for people to see - all we can do is put up a title. To start inserting content, you need the "body" tag:

<body>

There is a corresponding end tag, and everything sandwiched in between is the body of the web page. The "body" tag also introduces us to the concept of attributes. Attributes are features that you may want to add to the thing you're creating with a tag. So, if we are creating the body, what attributes might we want to add? Since they are going in the body tag, they will generally affect the entire HTML document. Here are the ones you will probably be most interested in.

  • bgcolor - sets the background color for your document
  • background - tiles (repeats) an image for your background (no default). If you set this and a bgcolor, the image will go over the color. background gets set equal to the URL of the image. This can be an image from another site, or an image stored locally.

    • If the image is in the same directory as the file from which you are linking, you can set background equal to the file name

      <body background="bg.jpg">

    • If the image is hierarchically one step above the file from which you are linking, you can use ../ to indicate to go up one level

      <body background="../bg.jpg">

    • If the image is in a subdirectory (call it sub) of the directory that contains the file which you are creating, you include the subdirectory name in the value:
      <body background="sub/bg.jpg">
  • text - sets the text color (default is black)
  • link - sets the link color (default is blue)
  • vlink - sets the color for a link that has been visited (default is purple)
  • alink - sets the color for an active link (a link you are in the process of clicking) (default is red)
That said, most of these attributes have fallen out of favor. Those styles are now more commonly done with Cascading Style Sheets. We will be covering those later in the semester. For now, it is ok to use these as we are learning HTML, but it is more proper in modern web design to encode stylistic attributes with CSS.

To add an attribute to a tag, you first put the tag name as usual. Then, you add the attribute set equal to its value. This is called a name-value pair: the name of the attribute is set equal to the value of that attribute.

<body bgcolor="white">

You may include as many attributes as you like. None are required. Tag and attribute names should be written in lower case. HTML itself is case insensitive, so browsers will accept tags with any mix of cases and old pages don't break, but XHTML required lower case and it has become the convention. Similarly, the values in attributes should be in quotes (double or single work), although HTML does not strictly require them for simple values. Thus, you should use them, but if you leave them off (and I sometimes will forget them because I've been doing this for so long), it will usually still work as long as the value has no spaces in it.
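As a rough sketch of how the pieces so far fit together, here is a small complete page that sets several body attributes at once. The file name bg.jpg and the color choices are placeholders picked for illustration, not files from this class.

```html
<html>
<head><title>This is my page title</title></head>
<!-- Attribute order does not matter; each one is a name="value" pair. -->
<!-- bg.jpg is a hypothetical image in the same directory as this file. -->
<body bgcolor="white" background="bg.jpg" text="black" link="blue" vlink="purple" alink="red">
This text is black on a white background, unless bg.jpg loads and covers the color.
</body>
</html>
```

If you delete any of the attributes, the browser simply falls back to its default for that one.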

Notice that, in the example above, I used "white" as a value for the background color. There is a set list of color names that you can use at http://www.w3schools.com/html/html_colornames.asp.

A better way to do colors is with hex codes. This requires a bit of a tangent into color theory, but it's interesting and well worth learning.

<Color Tangent>

Colors are made of three colors of light - red, green, and blue. You can make any color by properly combining the right amounts of red, green, and blue. In most current computer systems, we use 256 levels of each color - a value from 0 to 255. Thus, you can have any of those amounts of red, any of those of green, and any of blue. This lets you make nearly 17 million different colors. Here's how the combining works. Black is 0 red, 0 green, and 0 blue (no light). White is 255 red, 255 green, and 255 blue. If you have equal numbers of each color between 0 and 255 (like 100 red, 100 green, 100 blue) you get a shade of grey. If you have equal numbers of two colors and a higher amount of the third color, you get some variant of the higher color (like 200 red, 200 green, and 250 blue gives a light blue color). You can combine colors - red and blue make purple, blue and green make blue-green, and red and green make yellow. It takes some practice to figure this system out, but with time it is easy to get the color you want.

For convenience's sake, we represent colors as three numbers respectively giving the red, green, and blue values. So (255,180,0) gives an orangy color (255 red, 180 green, and 0 blue). With that much known, we can try to represent these colors in HTML. For example, if we want that orange color for the background, we might say bgcolor=255,180,0. However, HTML doesn't make it that easy. It works on a hex system, so each value is given in base 16 instead of base 10. To do that, we count like usual for 0 through 9, and then use letters to go higher: 0-1-2-3-4-5-6-7-8-9-a-b-c-d-e-f. This gives 16 values in one place. Once you go past f (which is 15), you carry over to the next place, and start again. So if we were to keep counting:
0-1-2-3-4-5-6-7-8-9-a-b-c-d-e-f-10-11-12-13-14-15-16-17-18-19-1a-1b-1c-1d-1e-1f-20
and so on. Counting this way, we go from 00 (for 0) to ff (for 255). So to represent our orange color, red is ff (for 255). Green is b4: remember that b in hex is equal to 11 (because it is 2 above 9), and it sits in the sixteens place, so b0 is 11 * 16 (our base) = 176. Then, we add on 4 to make it 180: b0 + 04 = b4. Blue is 00. Thus we have ff,b4,00. In HTML, we remove the commas, so the final hex value is ffb400.

Thankfully, you don't really need to learn this system. There are a lot of tools out there that let you pick your color from the spectrum and then give you the hex value. Here is one of them. Note that you will sometimes see hex values with a # in front of them (#ffffff vs. ffffff - for white). Both work in these HTML attributes, but the # form is the one CSS requires, so it is the safer habit to get into.
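To make the conversion concrete, here is the orange example written out step by step. This is only a sketch: the arithmetic sits in an HTML comment, and the body tag uses the resulting value with the # in front.

```html
<!-- Converting (255, 180, 0) to hex, one color at a time:
     red   255 = 15*16 + 15  gives  ff
     green 180 = 11*16 + 4   gives  b4
     blue    0 =  0*16 + 0   gives  00 -->
<body bgcolor="#ffb400">
This page has the orange background from the example.
</body>
```

Each pair of hex digits is one color, so you can read a value like #ffb400 back as red, green, blue from left to right.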

</Color Tangent>

Line Breaks and Spacing

So now we have put in a title and set the colors for our document. Once we have the body tag finished, we can start actually putting content into our webpage. You can just start adding text, but there are some things about HTML that won't let you get very far. The most noticeable is that HTML ignores white space. That means if you put two spaces in a row, HTML acts like there is just one. If you put a bunch of new lines in your code, HTML treats them like a single space. So, if you want to make paragraphs or if you want to have individual lines of information, you need tags to insert those line breaks.

The "br" tag puts in a single line break
so text shows up on the immediate next
line just like this. <BR> does not
have a corresponding end tag. XHTML, the stricter XML-based version of HTML, requires every tag to be closed, so the way people get around this is to combine the tag with a / at the end: <br />. This form indicates the start and end of a tag at once, and browsers accept it in regular HTML too. I'll often neglect to do this in my code because I've done it so long without the ending /, and plain <br> is also valid HTML, but if you want your code to be valid XHTML as well, you should include it!

The <p> tag is a paragraph mark. It puts in a line break and a blank line like those that separate every paragraph in this document.
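Here is a small sketch of how white space, br, and p behave side by side. The sentences are made up for the example; try pasting it into a file and opening it in your browser.

```html
<body>
These    words     have extra spaces
and a new line in the code, but the browser shows them as one run of text with single spaces.
<br />
This sentence starts on the very next line.
<p>This paragraph has a blank line above it.</p>
<p>So does this one.</p>
</body>
```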

Finally, there is the <pre> tag. Pre stands for preformatted text, and anything that is enclosed in a pre tag will show up with all of the white space you include. The drawback is that it shows up in a monospaced font like this:

This
	Is	
			Pre
					Formatted
							t e x t .    .         .        .
										

Tags like <b>, <i>, etc. will still work in the pre tag. If you want to show tags as tags instead of interpreting them, you can use the <xmp> tag. XMP stands for eXaMPle. You can enclose some code in it that looks like this: <XMP> <b>Hello</b>! How are <u>you</u>? </XMP> Be aware that xmp is obsolete in current HTML, so the special characters covered later in these notes are the more dependable way to show tags.
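As a hedged illustration, the first block below keeps its columns lined up because pre preserves the spaces, and the bold tag inside it is still interpreted. The second block shows one way to display tags as text without relying on xmp, using the &lt; and &gt; special characters described later in these notes. The names and scores are invented.

```html
<pre>
Name      Score
<b>Ann</b>       95
Bob       87
</pre>
<!-- To show a tag as text, escape its angle brackets. -->
<pre>&lt;b&gt;Hello&lt;/b&gt;! How are &lt;u&gt;you&lt;/u&gt;?</pre>
```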

By default, text aligns to the left. There are a couple ways you can change that. The most widely used is the <center> tag. It has a corresponding end tag and you can use it to center anything you want (text with formatting, pictures, links, etc).

<center>This is centered</center>

You can also use the align attribute of the P tag. If you want to align a paragraph to the right, you can do <p align=right>. Alignment to the right will stop whenever your next P tag shows up, so there really isn't a need to end the P tag. CSS is now the preferred way to do alignment, but it is fine to use these for now.
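A quick sketch of the two alignment options together (the text is placeholder text):

```html
<center>This line is centered</center>
<p align="right">This paragraph hugs the right edge.</p>
<p>This paragraph goes back to the default left alignment.</p>
```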

Text Formatting

Once you have learned these tags, you can make a simple text document, much like you can in a plain text editor like Notepad. There is a lot more formatting that we want to do, and these are some of the tags to let you do that. Remember to use the corresponding end tags. If you leave them off, your formatting will apply to the whole document (you may have a big bold document when you really wanted one word bolded).
  • <b>
    The Bold Tag. Put it and the end tag around a word you want bolded:
    I <b>love</b> traveling.
    produces
    I love traveling. (The word "love" appears in bold.)
  • <u>
    The Underline Tag. Works the same as bold.
  • <i>
    The Italic Tag. Works the same as Bold and Underline.
  • <sup>
    The superscript tag.
    x<sup>2</sup>
    produces
    x²
  • <sub>
    The subscript tag. Works the same as the sup tag.
  • <font>
    The font tag is somewhat complex. It allows you to do a lot of things to adjust the appearance of your text. It has a group of attributes.
    • color - changes the color of your font based on what you set this to.

      <font color="orange">Pumpkin</font>
      produces
      Pumpkin (shown in orange)

    • size - this can be set to a number (like size="3") or to a plus or minus value (like size="+2"). The default number size is 3. I prefer using the plus and minus values. Plus values make your text larger (like +1) and minus values make it smaller (like -3).
    • face - you can use this to set the typeface of your text. The hitch is that the viewer must have that font installed on their computer. A few fonts are generally safe: Arial, Courier, and Times are available on pretty much every system. <font face="courier"> will make your text look like this </font>. Try not to use exotic fonts. If you use a weird font that you have on your system, it won't show up for anyone who doesn't also have it installed.
    Note that in all of the above examples the end tag is </font> regardless of the attributes. The end tag never has any attributes included. It is always just the name of the tag preceded by a slash. This is true for all tags, not just the font tag.
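The formatting tags above can be mixed freely. The snippet below is just a sketch with made-up sentences; note that every tag is closed and that the end tag for font carries no attributes.

```html
<p>
I <b>love</b> <i>traveling</i>, especially in <u>October</u>.
</p>
<p>
<font color="orange" size="+2" face="arial">Pumpkin</font> season is here.
Water is H<sub>2</sub>O and the area of a square is x<sup>2</sup>.
</p>
```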

Finally, there are the Header tags. These allow you to put headings (bold text of different sizes) into your document. They are sized from 1 to 6, with 1 the largest.

  • <h1>This is h1</h1>

  • <h2>This is h2</h2>

  • <h3>This is h3</h3>

  • <h4>This is h4</h4>

  • <h5>This is h5</h5>
  • <h6>This is h6</h6>
You can put other inline formatting tags inside a header tag (like underlines, italics, etc).
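For instance, a page outline might look something like this. The headings are invented for the example, and the italic tag shows formatting nested inside a header tag.

```html
<h1>My Travel Page</h1>
<h2>Places I Have <i>Been</i></h2>
Plain text under the second-level heading.
<h3>Places I Want to Go</h3>
More plain text.
```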

Other Formatting

There are other things you can put in your document to help formatting.
  • Horizontal Rule <hr>
    The horizontal rule has no end tag. It puts a line across your page like this


    It has a couple interesting attributes.
    • Size - this controls the height of your rule in pixels. For example, <hr size=10> would make a rule that looks like this


    • Width - The width attribute controls how long the rule is (in pixels). Here's a rule with width=150


      It can also be set to a percent, and that will make your rule the corresponding percent wide of whatever container it is in. For example, this text is currently indented (which we will learn how to do later). By default, the rule will be the full width of this text area:
      but we can make it only a percentage: <hr width="50%">
      The rule is centered by default.

    These attributes can be used alone or together.

  • Special Characters
    The other special formatting feature to cover in this class is special characters. For example, if you want to write a document like this one which explains HTML, you need to show tags. How do you get a tag like <b> to show up in that form? If you just type that in, the browser will read it as though you are actually starting a bold tag. What we need is a special way to say "put a less than sign on the screen" and we can't do that by just typing a less than sign. Thus there is a special character for that: &lt;. This special encoding says "display a <". Likewise, > is done with &gt;.

    Now let's say you want to write the paragraph I just typed above to tell your students to type &gt;. How do you get that to show up on the screen? You can't just type it, because if you do it will show a >. So, you need a special character string for the & which is &amp;. Quiz: how do you do &amp;??

    There are a bunch of these special characters. The ones you need to know are the three listed above, and this one &nbsp;. This is a non-breaking space. When I said that HTML ignores white space, it really does it to the extreme. Not only won't it recognize a bunch of spaces in a row or a bunch of returns in a row, but it won't even recognize multiple <p> tags. If you put 5 <p> tags, it will only put in one blank line, as though you typed just one. If you want to get a lot of white space, you need to separate your spaces or your <p> or <br> tags with something. You can use this &nbsp; to do that.

    To get this white space:

     

     

     

    We type <p>&nbsp;<p>&nbsp;<p>&nbsp;<p>
    Take a look at the source code for this document if you're curious.
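Here is a short sketch that pulls the rule attributes and the special characters together. The sentences are placeholders; the attribute values are just examples.

```html
<!-- A rule 5 pixels tall that spans half of its container. -->
<hr size="5" width="50%">
To make a word bold, type &lt;b&gt;word&lt;/b&gt;.
<br>
To show a less-than sign, type &amp;lt; in your code.
<p>&nbsp;<p>&nbsp;<p>
This line sits below some extra white space.
```

In the browser, the second sentence displays the characters &lt; because the &amp; turns into a plain & and the rest is left alone.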

Comments

The final useful item for the day is comments. You can comment out sections of HTML code to check for errors, or to temporarily remove them from showing up. The comment is really one big tag that contains the information you want to hide. You start with <!--, followed by the text to hide, and end with --> when you're done hiding:

<html>
<body bgcolor="white">
This is nice text.
<!-- But I want to hide this part !! >:| -->
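A comment can also span several lines, which is handy for switching off a chunk of markup while you hunt for a problem. This is a minimal sketch with made-up content:

```html
<body bgcolor="white">
This is nice text.
<!-- The two lines below are switched off while I look for a problem.
<hr size="10">
<font color="orange">Pumpkin</font>
-->
This text still shows up.
</body>
```

Comments cannot be nested, so avoid putting one comment inside another; the first --> the browser finds ends the whole thing.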