| Languages and Character Sets Challenges faced when handling different characters sets |
I haven't done much work on non-English-language sites yet, despite being from Germany, so now I have to ask a very basic question:
Is the language declaration in the html element important, or is it not really necessary? Does it help with local rankings or help prevent issues with the SEs? I guess every major SE probably understands that a website is supposed to rank for English-language searches if the majority (or all) of its content is written in English anyway. But maybe it's important if you use a different language on the page (for whatever reason)? I assume <html lang=".."> is enough, or isn't it? Thanks!
|
|
|||
|
Hi,
that's an interesting question. My experience shows that a declaration isn't necessary. That is, you need not declare the language you are using, provided the entire text is in one specific language. For example, if you have a multilingual website you need not worry, because your content is organized in several areas, each in a single language, and that's no problem. If you have a multilingual blog and you post in different languages, you'll want to avoid mixing languages, so you'll create multiple categories for each subject, one category per language. The only case where I'd see the language attribute as appropriate (and necessary) is when you need to mix several languages within the body of one page and want to be sure the search engines don't get mixed up. They really might not, but better safe than sorry. It would be nice to hear of other experiences related to this.
|
|
|||
|
thanks for the reply,
I asked this question because I was reminded of my first attempt to launch a website: I put my site on my hosting account, but the design was still not done, and for the content areas I had just put in some random German phrase, which I copied dozens of times to make the design look a bit less empty. Then someone on an SEO forum suggested that I should use a language declaration, because I might now have a problem with search engines (as the site, which was intended to be in English, had German words on it). Do you think this would have been a problem if I had not added lang="en"? Probably hard or impossible to tell without trying it (which would be a bad use of one's time, lol), hm? I was just curious, because I happened to think of it and thought multilingual SEOs might run into this more often than those who only SEO in English anyway.
|
|
|
|||
|
The language declaration, along with all other similar attributes, should be viewed as a suggestion to the search engines: a helping hand to help them figure things out on your website, and nothing more. They do as they please and have a mind of their own. Another attribute that makes me laugh is the REVISIT AFTER, as if the search engines are going to obey our command to revisit our websites based on our request. That's not going to happen ...
|
|
|||
|
Hi guys,
I use the lang= declaration across all my 16 properties, with settings like en-CA, en-IE, en-AU, zh-CN and others... When I launch a new site, I set the targeting in Google Webmaster Tools for that country; later on I switch it off when enough local links are pointing to it. Ideally you would use local TLDs for this, but sometimes you can't. Cheers
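Tags of this kind follow a language-COUNTRY shape: a two-letter language code, a hyphen, and a two-letter country code. A minimal Python sketch, assuming only that simplified shape, can catch slips such as a three-letter country code before they end up in a template. The function name `is_simple_lang_tag` is made up for illustration, and full language tags allow more forms (scripts, three-letter languages, variants) than this checks.

```python
import re

# Simplified shape only: two letters, optionally followed by a hyphen
# and two more letters. This does NOT verify that the codes exist.
SIMPLE_TAG = re.compile(r"[A-Za-z]{2}(?:-[A-Za-z]{2})?")


def is_simple_lang_tag(tag: str) -> bool:
    """Return True if tag looks like 'en' or 'en-CA' (shape check only)."""
    return SIMPLE_TAG.fullmatch(tag) is not None


# A three-letter country part such as 'AUS' fails the shape check.
for tag in ["en-CA", "en-IE", "zh-CN", "en-AUS"]:
    print(tag, is_simple_lang_tag(tag))
```

Because this is only a shape check, a well-formed but nonexistent code would still pass; checking against the actual code lists would need a lookup table.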
|
|||
|
Should be viewed as suggestions to the search engines, a helping hand to help them figure things out on your website and nothing more. They do as they please and have a mind of their own. Another attribute that makes me laugh is the REVISIT
Last edited by sjachille; 09-10-09 at 07:41 AM.
|
|||
|
1. Declare the language as an attribute of the html element:
<html lang="en"> or <html lang="fr-CA">
Syntax: lang="primary code-subcode", where the primary code is a two-letter language abbreviation (e.g. fr = French, en = English, ar = Arabic, zh = Chinese; the full list is ISO 639-1). The subcode is the two-letter country code (e.g. CA = Canada, US = United States, CN = China, HK = Hong Kong; the full list is ISO 3166-1).

2. Declare the language in the document head:
<meta http-equiv="Content-Language" content="en,fr,es" />

3. Declare the language within the document, on the element that contains the foreign text. E.g.:
The French translation for <strong>thank you</strong> is <strong lang="fr">merci</strong>
which renders as: The French translation for thank you is merci

Character encoding
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Some popular character encodings are the ISO 8859 series, Unicode, Big5, Guobiao (GB2312), etc. ISO 8859-1 (Latin 1), the most common encoding, works well for English, French and other Western European languages. Arabic characters can be encoded with Windows-1256, ISO 8859-6 or UTF-8. Chinese characters display properly with GB2312 for simplified Chinese and Big5 for traditional Chinese (Taiwan and Hong Kong). If a page contains more than one language, UTF-8 is recommended, as it covers the characters of virtually all languages.

Choosing the right encoding is a very important part of creating a web page. These Chinese characters, 五一 二五六 三三, will look different if the right character encoding is not used. This page's encoding is UTF-8 and the Chinese text looks OK. Using the wrong encoding will display garbled symbols or question marks instead of the Chinese text, for example when iso-8859-1 is used to display Chinese and English characters.
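The effect of a mismatched encoding can be reproduced in a few lines. The sketch below uses Python purely for illustration (nothing in this thread depends on it), and the sample string is made up. It shows two different failure modes: UTF-8 bytes read as ISO 8859-1 turn into unrelated Latin characters, while text forced into ISO 8859-1 loses its Chinese characters and ends up with question marks.

```python
# Sample text mixing English and Chinese (illustrative only).
text = "Hello 中文"

# The bytes a UTF-8 encoded page would contain.
utf8_bytes = text.encode("utf-8")

# Case 1: UTF-8 bytes interpreted as ISO 8859-1.
# Every byte maps to some Latin-1 character, so nothing fails,
# but each Chinese character becomes three unrelated characters.
misread = utf8_bytes.decode("iso-8859-1")

# Case 2: the text is forced into ISO 8859-1, which has no Chinese
# characters. With errors="replace", each one becomes "?".
forced = text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")

# Case 3: the same bytes read with the matching encoding survive intact.
roundtrip = utf8_bytes.decode("utf-8")

print(repr(misread))  # repr() makes the invisible characters visible
print(forced)
print(roundtrip)
```

The third case is the reason for the UTF-8 recommendation: as long as the declared charset matches the bytes actually served, the text survives unchanged.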
|
|||
|
There's some very useful information there, and you should definitely use the language tag when you can. However, search engines do not rely on it much, for one very simple reason: so few people set it properly that it's not very accurate. The vast majority of sites have their language tag set to 'en', even when the language is not English. I once did a test of European government sites and virtually all had the 'en' tag. If you're a search engine, that means it's not a very good signal and you have to rely on something else. That something else is most likely to be a local domain, clear un-mixed language content, and inbound links from sites in the targeted language.