Russian Charset Problem


Russian Charset Problem

Veysel Harun Sahin
Hi,

I am hosting my site at a web hosting company. I want to use a Russian
charset, so I set my meta tags like this: "<meta http-equiv="content-type"
content="text/html; charset=windows-1251" />". But they don't seem to
work. When I validate my pages through the W3C Markup Validation Service,
the encoding is reported as iso-8859-9, and I see this note: "The
character encoding specified in the HTTP header (iso-8859-9) is
different from the value in the <meta> element (windows-1251). I will
use the value from the HTTP header (iso-8859-9) for this validation."
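
The validator note reflects a general precedence rule: when the HTTP
Content-Type header carries a charset parameter, that value wins over the
declaration inside the document. A minimal Python sketch of that rule
follows. The function names are hypothetical, and the regular expression is
a deliberately simple stand-in for real HTML parsing.

```python
import re
from email.message import Message


def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def charset_from_meta(html):
    """Return the charset named in a <meta> element, or None (simplified)."""
    match = re.search(r"""<meta[^>]+charset=["']?([\w-]+)""", html, re.IGNORECASE)
    return match.group(1).lower() if match else None


def effective_charset(content_type, html):
    """The header value takes precedence; the <meta> value is only a fallback."""
    return charset_from_header(content_type) or charset_from_meta(html)


page = '<meta http-equiv="content-type" content="text/html; charset=windows-1251" />'
```

With the header from the validator message, `effective_charset("text/html; charset=iso-8859-9", page)` yields the header's value, while a header without a charset parameter falls back to the value in the meta tag.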

Is there any way to override the default charset and use the ones
which appear on the meta tags without modifying the server
configuration file?

Thanks in advance.

--
Veysel Harun Sahin

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: [hidden email]
   "   from the digest: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Russian Charset Problem

Ivan Barrera A.
> I am hosting my site at a web hosting company. I want to use russian
> charset and set my meta tags like "<meta http-equiv="content-type"
> content="text/html; charset=windows-1251" />". But they don't seem to
> work. When i validate my pages through w3c markup validation service
> my pages' encoding seems iso-8859-9. And I see this note: "The
> character encoding specified in the HTTP header (iso-8859-9) is
> different from the value in the <meta> element (windows-1251). I will
> use the value from the HTTP header (iso-8859-9) for this validation."
>


Delete the AddDefaultCharset directive from httpd.conf

> Is there any way to override the default charset and use the ones
> which appear on the meta tags without modifying the server
> configuration file?
>
> Thanks in advance.
>


Re: Russian Charset Problem

Ivan Barrera A.
>>
>
>
>
> Delete the AddDefaultCharset directive from httpd.conf
>

Or put your charset there instead:

AddDefaultCharset windows-1251

(Sorry, I forgot to put this in the other email.)


Re: Russian Charset Problem

Veysel Harun Sahin
Thanks Ivan. But I am hosting my site at a web hosting company and I
don't have access to the configuration file. Is there another solution?

On 5/30/05, Ivan Barrera A. <[hidden email]> wrote:

> >>
> >
> >
> >
> > Delete the AddDefaultCharset directive from httpd.conf
> >
>
> Or put your charset in that place :
>
> AddDefaultCharset windows-1251
>
> (sorry i forgot to put this on the other email)
>


--
Veysel Harun Sahin


RE: Russian Charset Problem

Denis Gerasimov

No problem.

There are at least 4 solutions:

1. Put the line

AddDefaultCharset windows-1251

to your .htaccess file (see
http://httpd.apache.org/docs-2.0/mod/core.html#adddefaultcharset)

2. Add charset support to your scripting engine.

If you use PHP, you may put the following line in your php.ini

default_charset = "windows-1251"

or in your .htaccess

php_value default_charset "windows-1251"

or in your PHP code

ini_set('default_charset', 'windows-1251');

(other scripting languages should support something like this too, I am
sure)

3. Send this HTTP header yourself

e.g. in PHP:

header('Content-Type: text/html; charset=windows-1251');

4. Ask your hosting provider to assist you somehow :-)
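
For scripting environments other than PHP, option 3 can be sketched in the
same spirit. The following is a minimal WSGI application in Python, offered
as an illustration only: the name `app` and the sample text are assumptions,
and whether a given host runs WSGI at all depends on the hosting setup. The
key points are that the Content-Type header names the charset explicitly and
that the body is actually encoded in that charset.

```python
def app(environ, start_response):
    """Hypothetical WSGI app that declares and uses windows-1251."""
    # Encode the body in the same charset that the header announces.
    body = "<p>Привет, мир</p>".encode("windows-1251")
    headers = [
        ("Content-Type", "text/html; charset=windows-1251"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, app)` can serve the app. As far as I know, AddDefaultCharset only supplies a charset when the response does not already specify one, so an explicit header like this should not be overridden, but it is worth confirming with the validator.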

HTH

Best regards,
Denis Gerasimov,
Chief Developer, VEKOS Ltd.
www.vekos.ru


Re: Russian Charset Problem

Arne Heizmann
In reply to this post by Veysel Harun Sahin
Veysel Harun Sahin wrote:
>
> I want to use russian charset

Why are people still creating new websites with these obsolete character
sets? Why not just use UTF-8? It's been around for long enough...


**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.

**********************************************************************



Re: Russian Charset Problem

André Malo
* Arne Heizmann <[hidden email]> wrote:

> Veysel Harun Sahin wrote:
> >
> > I want to use russian charset
>
> Why are people still creating new websites with these obsolete character
> sets? Why not just use UTF-8? It's been around for long enough...

I can tell you the reasons for using koi8-r, euc-jp etc. instead of utf-8
for the httpd docs: the resulting documents are significantly smaller.
So I wouldn't consider those character encodings obsolete. They have
a purpose ;)

nd


Re: Russian Charset Problem

Veysel Harun Sahin
Thanks Denis :)

The .htaccess file has solved the problem.

Best regards.

On 5/31/05, André Malo <[hidden email]> wrote:

> * Arne Heizmann <[hidden email]> wrote:
>
> > Veysel Harun Sahin wrote:
> > >
> > > I want to use russian charset
> >
> > Why are people still creating new websites with these obsolete character
> > sets? Why not just use UTF-8? It's been around for long enough...
>
> I can tell you the reasons for using koi8-r, euc-jp etc instead of utf-8
> for the httpd docs. The resulting documents are significant smaller.
> So I wouldn't consider those character encodings obsolete. They have
> a purpose ;)
>
> nd


--
Veysel Harun Sahin


RE: Russian Charset Problem

Denis Gerasimov
In reply to this post by Arne Heizmann

>
> Veysel Harun Sahin wrote:
> >
> > I want to use russian charset
>
> Why are people still creating new websites with these obsolete character
> sets? Why not just use UTF-8? It's been around for long enough...
>

Okay, I also think that is a good idea (especially for multi-language
sites), but I think there are many old browsers that don't support UTF-8
documents... Does anyone have info/experience on this issue?

BTW, I don't believe that a UTF-8 page is *significantly* larger - most of the
characters (e.g. HTML code) are still represented with one byte.

And what about images? An HTML document usually contains many images, so I
suppose that the impact on traffic won't be so serious.

Please, let me know if I'm wrong.

Best regards,
Denis Gerasimov,
Chief Developer, VEKOS Ltd.
www.vekos.ru





Re: Russian Charset Problem

André Malo
* "Denis Gerasimov" <[hidden email]> wrote:

> BTW I don't believe that UTF-8 page is *significant* larger - most of the
> characters (e.g. HTML code) are still represented with one byte.

If you have messy HTML code, this argument happens to be true ;-)
If you have more text and each russian character takes about 3 octets
with UTF-8, it's just the wrong assumption.

nd


RE: Russian Charset Problem

Denis Gerasimov

>
> > BTW I don't believe that UTF-8 page is *significant* larger - most of the
> > characters (e.g. HTML code) are still represented with one byte.
>
> If you have messy HTML code, this argument happens to be true ;-)

I usually don't ;-)

> If you have more text and each russian character takes about 3 octets
> with UTF-8, it's just the wrong assumption.

Only 2 octets in fact. And I really think that HTML pages with lots of text
are not user-friendly.
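
The octet counts are easy to check. Here is a small Python sketch; the
helper name `encoded_sizes` and the sample strings are my own choices, not
anything from the httpd docs. It encodes the same text in several encodings
and compares the byte lengths.

```python
def encoded_sizes(text, encodings):
    """Map each encoding name to the number of bytes the text takes in it."""
    return {enc: len(text.encode(enc)) for enc in encodings}


# Seven Cyrillic letters: one byte each in the legacy encodings, two in UTF-8.
russian = encoded_sizes("Русский", ["utf-8", "windows-1251", "koi8-r"])

# Three kanji: two bytes each in EUC-JP, three in UTF-8.
japanese = encoded_sizes("日本語", ["utf-8", "euc-jp"])

# ASCII markup takes one byte per character in all of these encodings.
markup = encoded_sizes("<p></p>", ["utf-8", "windows-1251", "koi8-r"])
```

So Cyrillic text doubles in size under UTF-8 rather than tripling, and the markup around it does not grow at all.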

Thanks for your opinion.


Re: Russian Charset Problem

Arne Heizmann
In reply to this post by André Malo
André Malo wrote:
>
> I can tell you the reasons for using koi8-r, euc-jp etc instead of utf-8
> for the httpd docs. The resulting documents are significant smaller.

ru:        15169 => 20713
ru+gzip:    5454 =>  6160

ja:        14063 => 16595
ja+gzip:    4833 =>  5237

Hardly "significantly smaller".
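
To put those figures in relative terms, here is a short Python sketch. It
assumes each pair is a size in bytes before and after conversion to UTF-8,
which is only my reading of the numbers above; the names `measurements` and
`percent_increase` are illustrative.

```python
# Pairs copied from the figures above: (size before, size after).
# Assumption: sizes in bytes, legacy encoding first, UTF-8 second.
measurements = {
    "ru": (15169, 20713),
    "ru+gzip": (5454, 6160),
    "ja": (14063, 16595),
    "ja+gzip": (4833, 5237),
}


def percent_increase(before, after):
    """Relative growth from before to after, in percent."""
    return (after - before) / before * 100


increases = {
    label: round(percent_increase(before, after), 1)
    for label, (before, after) in measurements.items()
}
```

Under that assumption the uncompressed growth is about 36.5% (ru) and 18.0% (ja), and with gzip it drops to about 12.9% and 8.4%.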

Especially considering that you are limiting yourself to a very small
set of characters. As a result, you have to put the ugly hacky "ru" and
"ja" on the pages rather than the proper "Русский" and "日本語" which
users are more likely to recognise. Yes, I know you can use numerical
entities in HTML to achieve this nonetheless, but the more you use
those, the less of a "benefit" your legacy encoding becomes.

Timwi



Re: Russian Charset Problem

André Malo
* Arne Heizmann <[hidden email]> wrote:

> > I can tell you the reasons for using koi8-r, euc-jp etc instead of utf-8
> > for the httpd docs. The resulting documents are significant smaller.
>
> ru:        15169 => 20713
> ru+gzip:    5454 =>  6160
>
> ja:        14063 => 16595
> ja+gzip:    4833 =>  5237

Uhm, what do these numbers refer to?

> Especially considering that you are limiting yourself to a very small
> set of characters. As a result, you have to put the ugly hacky "ru" and
> "ja" on the pages rather than the proper "Русский" and "日本語" which
> users are more likely to recognise.

Nope, the iso-tokens are chosen as linktext on purpose. The native language
names are in the title (or should be there at least, depends on the translator,
however).

[note that my mail client here can't recognize utf-8 properly, I'm leaving it
as is ...]

> Yes, I know you can use numerical
> entities in HTML to achieve this nonetheless, but the more you use
> those, the less of a "benefit" your legacy encoding becomes.

As a matter of fact, numeric character references are rare within the httpd docs.

nd
