
Thread: Web Writing Tips - Cleaning HTML

  1. #1
    Join Date
    Dec 2004
    Location
    Nanaimo, BC, Canada
    Posts
    6,810

    Default Web Writing Tips - Cleaning HTML

    This post may seem limited right now, because I can only think of so many issues off the top of my head. But if people co-operate, we'll be able to update/edit this post as time goes on. It might get pretty big!

    Keep code as short as possible

    Search engines don't like it when you have a lot of unnecessary code. Whether it's because they're html snobs, or simply because they stop indexing after a certain number of characters, the upshot is that your code should be shorter rather than longer. So this thread should be of use not only to those who are writing for the web, but also to those who receive copy in the form of Word documents (or the like) and want to clean up the html before pasting it into the CMS.

    Extraneous Bits of Code

    <p><em>Italics</em><em> make things kinda hard to read.</em></p>

    See how the </em><em> is completely pointless? Nuke it.
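
If you have a lot of these to nuke, a small script can do it. This is only a rough sketch in Python: the function name and the list of tags are my own picks for illustration, and it only handles tags written without attributes.

```python
import re

# A closing inline tag immediately followed by the same tag reopening,
# e.g. "</em><em>". Whitespace between the tags is deliberately NOT matched,
# so words that rely on that space stay separated.
ADJACENT_INLINE = re.compile(r"</(em|strong|i|b)><\1>", re.IGNORECASE)

def merge_adjacent_inline(html: str) -> str:
    """Remove pointless close/reopen pairs of the same inline tag."""
    return ADJACENT_INLINE.sub("", html)

messy = "<p><em>Italics</em><em> make things kinda hard to read.</em></p>"
print(merge_adjacent_inline(messy))
# <p><em>Italics make things kinda hard to read.</em></p>
```
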
    Or you might find this:

    <p>Gotta love those
    image poems whose
    shapes are relevant to the subject.</p>

    ...might as well put that all on the same line:
    <p>Gotta love those image poems whose shapes are relevant to the subject.</p>

    And there is always this:

    <p>
    And what rough beast, its hour come round at last
    </p>

    They can share the line:
    <p>And what rough beast, its hour come round at last</p>
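
Both of the line-joining fixes above can be sketched in a few lines of Python. The function name is hypothetical, and it assumes simple, un-nested <p> blocks; anything inside <pre> is left alone because the pattern only matches <p>.

```python
import re

# Opening <p ...> tag, its contents, and the closing </p>.
# \b keeps the pattern from matching <pre>, where whitespace matters.
PARAGRAPH = re.compile(r"(<p\b[^>]*>)(.*?)</p>", re.DOTALL | re.IGNORECASE)

def tighten_paragraphs(html: str) -> str:
    """Put each paragraph on one line and trim space inside the tags."""
    def fix(match):
        # split() with no arguments drops leading/trailing whitespace and
        # treats any run of spaces or line breaks as a single separator.
        inner = " ".join(match.group(2).split())
        return f"{match.group(1)}{inner}</p>"
    return PARAGRAPH.sub(fix, html)

poem = "<p>Gotta love those\nimage poems whose\nshapes are relevant to the subject.</p>"
print(tighten_paragraphs(poem))
# <p>Gotta love those image poems whose shapes are relevant to the subject.</p>
```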

    Here's another case:

    <p align="left">Slouches towards Bethlehem to be born. </p>

    In most cases, the align="left" is redundant: unless some definition somewhere says that things should be aligned in some other direction, your text will be aligned left anyway.

    The same applies to <a href="/mustard-bath.php" target="_self">. You usually don't need the target="_self" because it's the default anyway; you only need it if some zany webmaster has specified that your links will all be target="_blank" by default, or something.
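
Here is a rough Python sketch for stripping those two default attributes. The function name is made up for illustration, and it assumes nothing on your site overrides the defaults (if a stylesheet or base target does, removing these will change the page).

```python
import re

# Any single tag, e.g. <p align="left"> or <a href="..." target="_self">.
TAG = re.compile(r"<[^<>]+>")

# Attributes that only restate the browser default.
DEFAULTS = re.compile(r'\s+(?:align\s*=\s*"left"|target\s*=\s*"_self")', re.IGNORECASE)

def strip_default_attributes(html: str) -> str:
    """Drop align="left" and target="_self", but only inside tags."""
    return TAG.sub(lambda m: DEFAULTS.sub("", m.group(0)), html)

print(strip_default_attributes('<p align="left">Slouches towards Bethlehem to be born. </p>'))
# <p>Slouches towards Bethlehem to be born. </p>
```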

    Here is another kind of thing you might see:

    <p font face="Arial" color="#0000ff">

    Almost all of the font styles I see in people's html are extraneous, coming from some original Word document or something. Your designer has probably already specified (in the css) how fonts will look, so you shouldn't need to play with "font" in your html unless you want to make some lines smaller or coloured. Better: ask your designer to create a style for you in the css, rather than using "inline styles" (styling code repeated on a page-by-page basis, i.e., redundant, bloated code).

    So, in the example above, you can shorten it to <p>.
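
The same kind of cleanup can be scripted. This is a sketch only: the function name and the list of attributes (font, face, color, size) are my assumptions, and it presumes your css already takes care of how fonts look.

```python
import re

TAG = re.compile(r"<[^<>]+>")

# <font ...> and </font> wrappers, removed outright (their contents stay).
FONT_TAG = re.compile(r"</?font\b[^>]*>", re.IGNORECASE)

# Stray presentational attributes, with or without a quoted value.
PRESENTATIONAL = re.compile(
    r'\s+(?:font|face|color|size)(?:\s*=\s*"[^"]*")?(?=[\s/>])',
    re.IGNORECASE,
)

def strip_font_markup(html: str) -> str:
    """Unwrap <font> tags and drop font-related attributes inside other tags."""
    html = FONT_TAG.sub("", html)
    return TAG.sub(lambda m: PRESENTATIONAL.sub("", m.group(0)), html)

print(strip_font_markup('<p font face="Arial" color="#0000ff">Hello</p>'))
# <p>Hello</p>
```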

    Examples of Non-HTML code

    Some extraneous code is recognizably NOT html at all, deriving from programs like Word or maybe FrontPage, and should be duly eliminated from your code, as it is useless at best and might mess up your page at worst. Here are some examples (which I'll fill in if people can provide more examples):

    <st1:place>
    <st1:city>
    Nanaimo</st1:city>
    ,
    <st1:province>British Columbia </st1:province>
    </st1:place>
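
Tags like these all have a prefix and a colon in the name, which makes them easy to pick out. A rough Python sketch follows; the function name is hypothetical, and because it also squeezes all whitespace it is meant for a short snippet like the one above, not a whole page.

```python
import re

# Opening or closing tags whose name contains a colon, e.g. <st1:city>.
NAMESPACED_TAG = re.compile(r"</?[A-Za-z][\w.-]*:[\w.-]+[^>]*>")

def strip_namespaced_tags(html: str) -> str:
    """Remove prefix:name tags, keep their text, and tidy the leftover spacing."""
    text = NAMESPACED_TAG.sub("", html)
    text = " ".join(text.split())                 # collapse line breaks and runs of spaces
    return re.sub(r"\s+([,.;:!?])", r"\1", text)  # no space before punctuation

snippet = (
    "<st1:place>\n<st1:city>\nNanaimo</st1:city>\n,\n"
    "<st1:province>British Columbia </st1:province>\n</st1:place>"
)
print(strip_namespaced_tags(snippet))
# Nanaimo, British Columbia
```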


    What are things like &nbsp; , &quot; , and &reg; ?

    There are characters that are hard to find on a keyboard, like ® and á. For these, there are "html entities", which are short bits of code that begin with an ampersand. In html, a "®" reads like this:

    &reg;

    and an "á" is made by using this:

    &aacute;

    Here is a convenient list of HTML entities. Just make sure you ignore the numeric codes that come up when you mouse over the boxes.
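
If you'd rather not look entities up by hand, Python's standard html module knows the named ones. A small sketch (the helper name to_named_entities is mine, not part of the library):

```python
import html
from html.entities import codepoint2name

def to_named_entities(text: str) -> str:
    """Replace non-ASCII characters with a named entity when one exists."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name:
            out.append(f"&{name};")
        else:
            out.append(ch)  # plain ASCII, or no named entity available
    return "".join(out)

print(to_named_entities("® and á"))        # &reg; and &aacute;
print(html.unescape("&reg; and &aacute;")) # ® and á
```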

    &nbsp; in Particular

    This html entity, which creates a non-breaking space (it looks just like the space between these two words: ragged claws), often infests a page in a very bad way. All too often, code that should look like this:

    <p>Out of the ash<br />
    I rise with my red hair<br />
    And I eat men like air.</p>

    will look like this:

    <p> &nbsp;Out of the ash&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />
    I rise with&nbsp;my&nbsp;red&nbsp;hair<br />
    And I&nbsp;eat&nbsp;men like&nbsp;air.</p>

    People often don't catch this, because it still looks right in the browser. Holy extra code!

    However, be warned! Sometimes you might see an &nbsp; all by itself, like this:

    <p>&nbsp;</p>

    In this case, the space may be serving a function, and you should watch the results of removing it - be ready to put it back!
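
Here is one way the cleanup could be scripted in Python while leaving a lone <p>&nbsp;</p> untouched. It's a sketch under assumptions: the function name is made up, and it treats every other &nbsp; as clutter, so a non-breaking space you actually wanted (to keep two words on the same line, say) would be turned into a normal space too.

```python
import re

# A paragraph holding nothing but one &nbsp;: may be a deliberate spacer, so keep it.
SPACER = re.compile(r"(<p>\s*&nbsp;\s*</p>)", re.IGNORECASE)

# One or more &nbsp; entities, plus any ordinary spaces or tabs around them.
NBSP_RUN = re.compile(r"(?:[ \t]*&nbsp;[ \t]*)+")

def clean_nbsp(html: str) -> str:
    """Turn runs of &nbsp; into a single normal space, except in spacer paragraphs."""
    parts = SPACER.split(html)  # odd-numbered parts are the spacers
    for i in range(0, len(parts), 2):
        text = NBSP_RUN.sub(" ", parts[i])
        text = re.sub(r"(<p\b[^>]*>)[ \t]+", r"\1", text)      # no space right after <p>
        text = re.sub(r"[ \t]+(<br\s*/?>|</p>)", r"\1", text)  # none before <br /> or </p>
        parts[i] = text
    return "".join(parts)

messy = (
    "<p> &nbsp;Out of the ash&nbsp;&nbsp;&nbsp;<br />\n"
    "I rise with&nbsp;my&nbsp;red&nbsp;hair<br />\n"
    "And I&nbsp;eat&nbsp;men like&nbsp;air.</p>"
)
print(clean_nbsp(messy))
```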

    A note about linking to other pages on your site ("internal links")

    You can be economical with your code by using relative linking. When you link to an internal page (another page on your site), you don't need to write out its entire (absolute) url. You can just write a relative url (here, one that starts with a slash and is resolved from the root of your site), which is shorter. Example:

    <p><a href="http://www.floridastateinfo.com/pensacola/pensacola-beach-rental.php">Pensacola Beach Rentals</a></p>

    only needs to be like this:

    <p><a href="/pensacola/pensacola-beach-rental.php">Pensacola Beach Rentals</a></p>

    This also applies to images, etc, on your own site that you want to reference:

    <img src="/pensacola/beach.jpg" />

    (more info on this by REW Fergus in another thread)
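
For a page with lots of internal links, the conversion can be sketched in Python with the standard urllib.parse module. The function names and the SITE_HOSTS set are assumptions for illustration; put your own domain there. Links to other sites are left as they are.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumption: the hostnames that count as "your own site".
SITE_HOSTS = {"www.floridastateinfo.com"}

# href="..." and src="..." attributes with double-quoted values.
URL_ATTR = re.compile(r'\b(href|src)="([^"]*)"', re.IGNORECASE)

def to_root_relative(url: str, hosts=SITE_HOSTS) -> str:
    """Drop the scheme and host from a url that points at one of our own hosts."""
    parts = urlsplit(url)
    if parts.scheme in ("http", "https") and parts.hostname in hosts:
        return urlunsplit(("", "", parts.path or "/", parts.query, parts.fragment))
    return url  # external or already relative: leave untouched

def relativize_links(html: str) -> str:
    """Rewrite every internal href/src in a chunk of html."""
    return URL_ATTR.sub(
        lambda m: f'{m.group(1)}="{to_root_relative(m.group(2))}"', html
    )

link = '<a href="http://www.floridastateinfo.com/pensacola/pensacola-beach-rental.php">Pensacola Beach Rentals</a>'
print(relativize_links(link))
# <a href="/pensacola/pensacola-beach-rental.php">Pensacola Beach Rentals</a>
```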

    Keep it Neat

    Now, I know I'll get some kind of flak for encouraging everyone to remove ALL whitespace in their html. I'm not saying you shouldn't keep your html readable by human webmasters as well as browsers and search engine spiders. You should still have intuitive line breaks like this:

    <h1>Yummy Hommous Recipe</h1>
    <p>However you spell hummus, its ingredients remain simple!:</p>
    <ul>
    <li>Cooked chick peas</li>
    <li>Freshly squeezed lemon juice</li>
    <li>Fresh parsley</li>
    <li>Tahini or other nutty paste</li>
    <li>Roasted red pepper (optional)</li>
    <li>Olive oil (use liberally)</li>
    <li>Fresh garlic</li>
    </ul>

    Dang - can't get the proper formatting - but there wouldn't be quite as much whitespace as you see above. No need for the line breaks between the ul's and the li's.
    Last edited by Gerry; 12-26-2006 at 02:03 PM.

  2. #2
    Join Date
    Aug 2005
    Location
    Bonita Springs, Florida
    Posts
    1,501

    Default Re: Web Writing Tips - Cleaning HTML

    Nice post! Good points. Now if I can just find the time...
    Benjamin Dona, Broker/Owner
    Gulf Coast Associates, Realtors
    Bonita Springs Real Estate | Naples Real Estate | Southwest Florida Blog

  3. Default Re: Web Writing Tips - Cleaning HTML

    Quote Originally Posted by Benjamin Dona
    Nice post! Good points. Now if I can just find the time...
    I second that! Is there a program available that will clean up extraneous code?

  4. #4
    Join Date
    Oct 2004
    Location
    Florida
    Posts
    2,132

  5. Default Re: Web Writing Tips - Cleaning HTML

    Thanks. Popular tool,- PR6!

  6. #6
    Join Date
    Oct 2005
    Posts
    160

    Default Re: Web Writing Tips - Cleaning HTML

    Nice post, Gerry! One thing I sometimes do if I have a Word .doc (or other messy format) that I want to turn into HTML is paste everything into the code view of Dreamweaver, then use the design view to space it out properly, using the original .doc as a reference (luckily I have 2 screens to make it easy). This way, I don't have to correct any code because only the text gets copied - I just go through and fix up the appearance - paragraph spacing, headers, quotes, etc. Usually I find it faster than trying to edit faulty code. But in other scenarios I inevitably run into many of the things you've listed, so it's good to have a reference sheet.

  7. #7
    Join Date
    Jun 2004
    Posts
    448

    Default Re: Web Writing Tips - Cleaning HTML

    I'll disagree with you about relative linking.
    Bob

  8. #8
    Join Date
    Dec 2004
    Location
    Nanaimo, BC, Canada
    Posts
    6,810

    Default Re: Web Writing Tips - Cleaning HTML

    Quote Originally Posted by seo matt
    This way, I don't have to correct any code because only the text gets copied - I just go through and fix up the appearance - paragraph spacing, headers, quotes, etc. Usually I find it faster than trying to edit faulty code.
    That sounds like a good way to deal with text that's infested. But I suppose you have to put links and images and stuff back in manually?
    Quote Originally Posted by sdhomes
    I'll disagree with you about relative linking.
    Sure thing. When would you like to do that?

  9. #9
    Join Date
    Oct 2005
    Posts
    160

    Default Re: Web Writing Tips - Cleaning HTML

    Quote Originally Posted by seogerry
    That sounds like a good way to deal with text that's infested. But I suppose you have to put links and images and stuff back in manually?
    Yeah, I put links and images back in manually, using the original code as a reference. So knowing the correct code would be good, in order to avoid duplicating the mistakes.

  10. #10
    Join Date
    Jun 2004
    Posts
    448

    Default Re: Web Writing Tips - Cleaning HTML

    Search engines don't like it when you have a lot of unnecessary code.
    Care to cite your sources on this? I know lots of high ranking pages with code bloat, including a few of my own.
    Bob
